Abstract. In many practical application domains, the software is organized into a set of
threads, whose activation is exclusive and controlled by a cooperative scheduling policy: 
threads execute, without any interruption, until they either terminate or yield the control 
explicitly to the scheduler. 

The formal verification of such software poses significant challenges. On the one side, 
each thread may have infinite state space, and might call for abstraction. On the other 
side, the scheduling policy is often important for correctness, and an approach based on 
abstracting the scheduler may result in loss of precision and false positives. Unfortunately, 
the translation of the problem into a purely sequential software model checking problem 
turns out to be highly inefficient for the available technologies. 

We propose a software model checking technique that exploits the intrinsic structure of 
these programs. Each thread is translated into a separate sequential program and explored 
symbolically with lazy abstraction, while the overall verification is orchestrated by the 
direct execution of the scheduler. The approach is optimized by filtering the exploration 
of the scheduler with the integration of partial-order reduction. 

The technique, called ESST (Explicit Scheduler, Symbolic Threads), has been implemented
and experimentally evaluated on a significant set of benchmarks. The results
demonstrate that the ESST technique is far more effective than software model checking
applied to the sequentialized programs, and that partial-order reduction can lead to further
performance improvements.



In many practical application domains, the software is organized into a set of threads that
are activated by a scheduler implementing a set of domain-specific rules. Particularly
relevant is the case of multi-threaded programs with cooperative scheduling, shared variables,
and mutually-exclusive thread execution. With cooperative scheduling, there is no
preemption: a thread executes, without interruption, until it either terminates or explicitly yields
the control to the scheduler. This programming model, simply called cooperative threads
in the following, is used in several software paradigms for embedded systems (e.g.,
SystemC [Ope05], FairThreads [Bou06], OSEK/VDX [OSE05], SpecC [GDPG01]), and also
in other domains (e.g., [CGM+98]).



Such applications are often critical, and it is thus important to provide highly effective 
verification techniques. In this paper, we consider the use of formal techniques for the 
verification of cooperative threads. We face two key difficulties: on the one side, we must 
deal with the potentially infinite state space of the threads, which often requires the use of 
abstractions; on the other side, the overall correctness often depends on the details of the 
scheduling policy, and thus the use of abstractions in the verification process may result in 
false positives. 

Unfortunately, the state of the art in verification is unable to deal with such
challenges. Previous attempts to apply various software model checking techniques to
cooperative threads (in specific domains) have demonstrated limited effectiveness. For
example, techniques like [KS05, TCMM07, CJK07] abstract away significant aspects of the
scheduler and synchronization primitives; they may thus report too many false positives,
due to loss of precision, and their applicability is limited. Symbolic techniques,
like [MMMC05, HFG08], show poor scalability because too many details of the scheduler are
included in the model. Explicit-state techniques, like [CCNR11], are effective in handling
the details of the scheduler and in exploring possible thread interleavings, but are unable
to counter the infinite nature of the state space of the threads [GV04]. Unfortunately, for
explicit-state techniques, a finite-state abstraction is not easily available in general.

Another approach could be to reduce the verification of cooperative threads to the
verification of sequential programs. This approach relies on a translation from (or
sequentialization of) the cooperative threads to (possibly non-deterministic) sequential
programs that contain both the mapping of the threads in the form of functions and the
encoding of the scheduler. The sequentialized program can be analyzed by means of
"off-the-shelf" software model checking techniques, such as [CKSY05, McM06, BHJM07], that
are based on the counter-example guided abstraction refinement (CEGAR) [CGJ+03]
paradigm. However, this approach turns out to be problematic. General purpose analysis
techniques are unable to exploit the intrinsic structures of the combination of scheduler and
threads, hidden by the translation into a single program. For instance, abstraction-based
techniques are inefficient because the abstraction of the scheduler is often too aggressive,
and many refinements are needed to re-introduce necessary details.

In this paper we propose a verification technique which is tailored to the verification 
of cooperative threads. The technique translates each thread into a separate sequential 
program; each thread is analyzed, as if it were a sequential program, with the lazy predicate 
abstraction approach [HJMS02, BHJM07]. The overall verification is orchestrated by the 
direct execution of the scheduler, with techniques similar to explicit-state model checking. 
This technique, in the following referred to as Explicit-Scheduler/Symbolic Threads (ESST)
model checking, lifts the lazy predicate abstraction for sequential software to the more 
general case of multi-threaded software with cooperative scheduling. 

Furthermore, we enhance ESST with partial-order reduction (POR) [God96, Pel93, Val91]. In
fact, despite its relative effectiveness, ESST often requires the exploration of a large number
of thread interleavings, many of which are redundant, with subsequent degradations in the
run time performance and high memory consumption [CMNR10]. POR essentially exploits
the commutativity of concurrent transitions that result in the same state when they are
executed in different orders. We integrate within ESST two complementary POR techniques,
persistent sets and sleep sets. The POR techniques in ESST limit the expansion of the
transitions in the explicit scheduler, while leaving the nature of the symbolic analysis of the
threads unchanged. The integration of POR into the ESST algorithm is only seemingly trivial,
because POR could in principle interact negatively with the lazy predicate abstraction used
for analyzing the threads.

The ESST algorithm has been implemented within the Kratos software model checker
[CGM+11]. Kratos has a generic structure, encompassing the cooperative threads
framework, and has been specialized for the verification of SystemC programs [Ope05] and of
FairThreads programs [Bou06]. Both SystemC and FairThreads fall within the paradigm
of cooperative threads, but they have significant differences. This indicates that the ESST 
approach is highly general, and can be adapted to specific frameworks with moderate effort. 
We carried out an extensive experimental evaluation over a significant set of benchmarks 
taken and adapted from the literature. We first compare ESST with the verification of 
sequentialized benchmarks, and then analyze the impact of partial-order reduction. The 
results clearly show that ESST dramatically outperforms the approach based on sequen- 
tialization, and that both POR techniques are very effective in further boosting the per- 
formance of ESST. 

This paper presents in a general and coherent manner material from [CMNR10] and
from [CNR11]. While in [CMNR10] and in [CNR11] the focus is on SystemC, the
framework presented in this paper deals with the general case of cooperative threads, without
focussing on a specific programming framework. In order to emphasize the generality of the
approach, the experimental evaluation in this paper has been carried out in a completely
different setting than the one used in [CMNR10] and in [CNR11], namely the FairThreads
programming framework. We also considered a set of new benchmarks from [Bou06] and
from [WH08], in addition to adapting some of the benchmarks used in [CNR11] to the
FairThreads scheduling policy. We also provide proofs of correctness of the proposed
techniques in Appendix A.

The structure of this paper is as follows. Section 2 provides some background in software
model checking via the lazy predicate abstraction. Section 3 introduces the programming
model to which ESST can be applied. Section 4 presents the ESST algorithm. Section 5
explains how to extend ESST with POR techniques. Section 6 shows the experimental
evaluation. Section 7 discusses some related work. Finally, Section 8 draws conclusions and
outlines some future work.

2. Background 

In this section we provide some background on software model checking via the lazy predi- 
cate abstraction for sequential programs. 

2.1. Sequential Programs. We consider sequential programs written in a simple impera- 
tive programming language over a finite set Var of integer variables, with basic control-flow 
constructs (e.g., sequence, if-then-else, iterative loops) where each operation is either an 
assignment or an assumption. An assignment is of the form x := exp, where x is a variable
and exp is either a variable, an integer constant, an explicit nondeterministic construct *, 
or an arithmetic operation. To simplify the presentation, we assume that the considered 
programs do not contain function calls. Function calls can be removed by inlining, under 
the assumption that there are no recursive calls (a typical assumption in embedded
software). An assumption is of the form [bexp], where bexp is a Boolean expression that can
be a relational operation or an operation involving Boolean operators. Subsequently, we 
denote by Ops the set of program operations. 

Without loss of generality, we represent a program P by a control-flow graph (CFG).

Definition 2.1 (Control-Flow Graph). A control-flow graph G for a program P is a tuple
$(L, E, l_0, L_{err})$ where

(1) $L$ is the set of program locations,

(2) $E \subseteq L \times Ops \times L$ is the set of directed edges labelled by a program operation from the
set $Ops$,

(3) $l_0 \in L$ is the unique entry location such that, for any location $l \in L$ and any operation
$op \in Ops$, the set $E$ does not contain any edge $(l, op, l_0)$, and

(4) $L_{err} \subseteq L$ is the set of error locations such that, for each $l_e \in L_{err}$, we have
$(l_e, op, l) \notin E$ for all $op \in Ops$ and for all $l \in L$.
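As an illustration only, the definition above can be sketched as a small Python data structure. The class names `Edge` and `CFG`, the method `is_well_formed`, and the representation of operations as strings are assumptions of this sketch and are not taken from the paper or from any tool.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edge:
    src: str   # source location
    op: str    # an assignment "x := exp" or an assumption "[bexp]"
    dst: str   # target location


@dataclass
class CFG:
    locations: set
    edges: set
    entry: str
    errors: set = field(default_factory=set)

    def is_well_formed(self) -> bool:
        """Check conditions (1)-(4) of the CFG definition."""
        # Entry and error locations must be program locations.
        locs_ok = self.entry in self.locations and self.errors <= self.locations
        # Every edge connects two program locations.
        edges_ok = all(e.src in self.locations and e.dst in self.locations
                       for e in self.edges)
        # No edge enters the entry location.
        no_edge_into_entry = all(e.dst != self.entry for e in self.edges)
        # No edge leaves an error location.
        no_edge_out_of_error = all(e.src not in self.errors for e in self.edges)
        return locs_ok and edges_ok and no_edge_into_entry and no_edge_out_of_error


# An assertion assert(y >= 0) at the entry, encoded as two branches.
cfg = CFG(
    locations={"l0", "l1", "lerr"},
    edges={Edge("l0", "[y >= 0]", "l1"), Edge("l0", "[y < 0]", "lerr")},
    entry="l0",
    errors={"lerr"},
)

# A graph violating condition (3): an edge goes back into the entry.
bad = CFG(locations={"l0", "l1"}, edges={Edge("l1", "x := 0", "l0")}, entry="l0")
```

In this encoding a violated assertion corresponds to reaching `lerr`, which is how safety is reduced to reachability of error locations.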

In this paper we are interested in verifying safety properties by reducing the verification 
problem to the reachability of error locations. 

Example 2.2. Figure 1 depicts an example of a CFG. Typical program assertions can be
represented by branches going to error locations. For example, the branches going out of $l_0$
can be the representation of assert(y >= 0).

A state $s$ of a program is a mapping from variables to their values (in this case integers).
Let $State$ be the set of states; we have $s \in State = Var \rightarrow \mathbb{Z}$. We denote by $Dom(s)$ the
domain of a state $s$. We also denote by $s[x_1 \mapsto v_1, \ldots, x_n \mapsto v_n]$ the state obtained from
$s$ by substituting the image of $x_i$ in $s$ by $v_i$ for all $i = 1, \ldots, n$. Let $G = (L, E, l_0, L_{err})$
be the CFG for a program P. A configuration $\gamma$ of P is a pair $(l, s)$, where $l \in L$ and $s$
is a state. We assume some first-order language in which one can represent a set of states
symbolically. We write $s \models \varphi$ to mean the formula $\varphi$ is true in the state $s$, and also say
that $s$ satisfies $\varphi$, or that $\varphi$ holds at $s$. A data region $r \subseteq State$ is a set of states. A data
region $r$ can be represented symbolically by a first-order formula $\varphi_r$, with free variables
from $Var$, such that all states in $r$ satisfy $\varphi_r$; that is, $r = \{s \mid s \models \varphi_r\}$. When the context is
clear, we also call the formula $\varphi_r$ a data region. An atomic region, or simply a region,
is a pair $(l, \varphi)$, where $l \in L$ and $\varphi$ is a data region, such that the pair represents the set
$\{(l, s) \mid s \models \varphi\}$ of program configurations. When the context is clear, we often refer to
both kinds of region as simply region.

The semantics of an operation $op \in Ops$ can be defined by the strongest post-operator
$SP_{op}$. For a formula $\varphi$ representing a region, the strongest post-condition $SP_{op}(\varphi)$ represents
the set of states that are reachable from any of the states in the region represented by $\varphi$ after
the execution of the operation $op$. The semantics of assignment and assumption operations
are as follows:

$SP_{x := exp}(\varphi) = \exists x'.\ \varphi[x/x'] \wedge (x = exp[x/x'])$, for $exp \neq *$,
$SP_{x := *}(\varphi) = \exists x'.\ \varphi[x/x'] \wedge (x = a)$, where $a$ is a fresh variable, and
$SP_{[bexp]}(\varphi) = \varphi \wedge bexp$,

where $\varphi[x/x']$ and $exp[x/x']$, respectively, denote the formula obtained from $\varphi$ and the
expression obtained from $exp$ by substituting the variable $x'$ for $x$. We define the application
of the strongest post-operator to a finite sequence $\sigma = op_1, \ldots, op_n$ of operations as the
successive application of the strongest post-operator to each operation as follows:
$SP_{\sigma}(\varphi) = SP_{op_n}(\ldots SP_{op_1}(\varphi) \ldots)$.
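To make the strongest post-operator concrete, the following sketch computes post-images over explicit, finite sets of states instead of formulas. This is a deliberate simplification and not the symbolic computation used in the paper: the helper names (`freeze`, `sp_assign`, `sp_assume`, `sp_sequence`) are hypothetical, expressions are modelled as Python functions, and a small finite `sample_domain` stands in for $\mathbb{Z}$ when handling the nondeterministic construct *.

```python
def freeze(state):
    """Make a state (dict: variable -> int) hashable so regions can be sets."""
    return frozenset(state.items())


def sp_assign(region, x, exp, sample_domain=range(-3, 4)):
    """Post-image of `region` under x := exp.

    `exp` is a function from a state dict to an int, or None for the
    nondeterministic construct *. The finite `sample_domain` stands in
    for the integers (an assumption of this sketch).
    """
    successors = set()
    for frozen in region:
        s = dict(frozen)
        values = sample_domain if exp is None else [exp(s)]
        for v in values:
            successors.add(freeze({**s, x: v}))
    return successors


def sp_assume(region, bexp):
    """Post-image under [bexp]: keep exactly the states satisfying bexp."""
    return {frozen for frozen in region if bexp(dict(frozen))}


def sp_sequence(region, ops):
    """Successive application of the post-operator along op_1, ..., op_n."""
    for op in ops:
        region = op(region)
    return region


# The sequence x := y, [x > 0], x := x + 1, y := x, [y < 0].
path = [
    lambda r: sp_assign(r, "x", lambda s: s["y"]),
    lambda r: sp_assume(r, lambda s: s["x"] > 0),
    lambda r: sp_assign(r, "x", lambda s: s["x"] + 1),
    lambda r: sp_assign(r, "y", lambda s: s["x"]),
    lambda r: sp_assume(r, lambda s: s["y"] < 0),
]

# "true" restricted to a small box of states.
everything = {freeze({"x": a, "y": b}) for a in range(-3, 4) for b in range(-3, 4)}
reachable = sp_sequence(everything, path)  # empty: the sequence cannot be executed
```

The final region is empty because, after the first assumption, x is positive, so y is at least 2 when the last assumption requires it to be negative.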



2.2. Predicate Abstraction. A program can be viewed as a transition system with tran- 
sitions between configurations. The set of configurations can potentially be infinite because 
the states can be infinite. Predicate abstraction [GS97] is a technique for extracting a finite
transition system from a potentially infinite one by approximating possibly infinite sets of 
states of the latter system by Boolean combinations of some predicates. 

Let $\Pi$ be a set of predicates over program variables in some quantifier-free theory $\mathcal{T}$. A
precision $\pi$ is a finite subset of $\Pi$. A predicate abstraction $\varphi^{\pi}$ of a formula $\varphi$ over a precision
$\pi$ is a Boolean formula over $\pi$ that is entailed by $\varphi$ in $\mathcal{T}$, that is, the formula $\varphi \Rightarrow \varphi^{\pi}$ is
valid in $\mathcal{T}$. To avoid losing precision, we are interested in the strongest Boolean combination
$\varphi^{\pi}$, which is called Boolean predicate abstraction [LNO06]. As described in [LNO06], for a
formula $\varphi$, the more predicates we have in the precision $\pi$, the more expensive the
computation of Boolean predicate abstraction. We refer the reader to [LNO06, CCF+07, CDJR09]
for the descriptions of advanced techniques for computing predicate abstractions based on
Satisfiability Modulo Theory (SMT) [BSST09].

Given a precision $\pi$, we can define the abstract strongest post-operator $SP^{\pi}_{op}$ for an
operation $op$. That is, the abstract strongest post-condition $SP^{\pi}_{op}(\varphi)$ is the formula $(SP_{op}(\varphi))^{\pi}$.
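The idea of Boolean predicate abstraction can be illustrated by brute-force enumeration over a small finite universe of states. This is only a sketch under strong assumptions: real implementations rely on SMT solvers rather than enumeration, and the names `boolean_abstraction`, `concretize`, `universe`, and `precision` are hypothetical. The abstraction is represented as the set of truth vectors (minterms) over the precision that are realized by some state of the region; their disjunction is the strongest Boolean combination of the predicates entailed by the region.

```python
def boolean_abstraction(region, precision):
    """Return the truth vectors over `precision` realized by states in `region`."""
    return {tuple(bool(p(s)) for p in precision) for s in region}


def concretize(minterms, precision, universe):
    """All states of the universe whose truth vector is one of the minterms."""
    return [s for s in universe
            if tuple(bool(p(s)) for p in precision) in minterms]


# A small box of states stands in for the (infinite) state space.
universe = [{"x": a, "y": b} for a in range(-2, 3) for b in range(-2, 3)]

# Precision: the two predicates x > 0 and y >= 0.
precision = [lambda s: s["x"] > 0, lambda s: s["y"] >= 0]

# The region described by the formula x = 1.
region = [s for s in universe if s["x"] == 1]

# Two minterms are realized: (x > 0, y >= 0) and (x > 0, not y >= 0),
# so the abstraction is equivalent to the single predicate x > 0.
abstraction = boolean_abstraction(region, precision)
```

Every state of the region lies in the concretization of its abstraction, which mirrors the validity of $\varphi \Rightarrow \varphi^{\pi}$; the abstraction over-approximates the region because it also admits states with x = 2.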

2.3. Predicate-Abstraction based Software Model Checking. One prominent software
model checking technique is the lazy predicate abstraction [BHJM07] technique. This
technique is a counter-example guided abstraction refinement (CEGAR) [CGJ+03]
technique based on on-the-fly construction of an abstract reachability tree (ART). An ART
describes the reachable abstract states of the program: a node in an ART is a region $(l, \varphi)$
describing an abstract state. Children of an ART node (or abstract successors) are obtained
by unwinding the CFG and by computing the abstract post-conditions of the node's data
region with respect to the unwound CFG edge and some precision $\pi$. That is, the abstract
successors of a node $(l, \varphi)$ is the set $\{(l_1, \varphi_1), \ldots, (l_n, \varphi_n)\}$, where, for $i = 1, \ldots, n$, we have
$(l, op_i, l_i)$ is a CFG edge, and $\varphi_i = SP^{\pi_i}_{op_i}(\varphi)$ for some precision $\pi_i$. The precision $\pi_i$ can be
associated with the location $l_i$ or can be associated globally with the CFG itself. The ART
edge connecting a node $(l, \varphi)$ with its child $(l', \varphi')$ is labelled by the operation $op$ of the
CFG edge $(l, op, l')$. In this paper computing abstract successors of an ART node is also
called node expansion. An ART node $(l, \varphi)$ is covered by another ART node $(l', \varphi')$ if $l = l'$
and $\varphi$ entails $\varphi'$. A node $(l, \varphi)$ can be expanded if it is not covered by another node and its
data region $\varphi$ is satisfiable. An ART is complete if no further node expansion is possible.
An ART node $(l, \varphi)$ is an error node if $\varphi$ is satisfiable and $l$ is an error location. An ART
is safe if it is complete and does not contain any error node. Obtaining a safe ART implies
that the program is safe.

The construction of an ART for the CFG $G = (L, E, l_0, L_{err})$ for a program P starts
from its root $(l_0, \top)$. During the construction, when an error node is reached, we check if
the path from the root to the error node is feasible. An ART path $\rho$ is a finite sequence
$\varepsilon_1, \ldots, \varepsilon_n$ of edges in the ART such that, for every $i = 1, \ldots, n-1$, the target node of $\varepsilon_i$
is the source node of $\varepsilon_{i+1}$. Note that the ART path $\rho$ corresponds to a path in the CFG.
We denote by $\sigma_\rho$ the sequence of operations labelling the edges of the ART path $\rho$. A
counter-example path is an ART path $\varepsilon_1, \ldots, \varepsilon_n$ such that the source node of $\varepsilon_1$ is the root
of the ART and the target node of $\varepsilon_n$ is an error node. A counter-example path $\rho$ is feasible
if and only if $SP_{\sigma_\rho}(true)$ is satisfiable. An infeasible counter-example path is also called a
spurious counter-example. A feasible counter-example path witnesses that the program P
is unsafe.

An alternative way of checking feasibility of a counter-example path $\rho$ is to create a
path formula that corresponds to the path. This is achieved by first transforming the
sequence $\sigma_\rho = op_1, \ldots, op_n$ of operations labelling $\rho$ into its single-static assignment (SSA)
form [CFR+91], where there is only one single assignment to each variable. Next, a
constraint for each operation is generated by rewriting each assignment x := exp into the
equality x = exp, with nondeterministic construct * being translated into a fresh variable,
and turning each assumption [bexp] into the constraint bexp. The path formula is the
conjunction of the constraint generated by each operation. A counter-example path $\rho$ is feasible
if and only if its corresponding path formula is satisfiable.

Example 2.3. Suppose that the operations labelling a counter-example path are 

x := y, [x > 0], x := x + 1, y := x, [y < 0], 

then, to check the feasibility of the path, we check the satisfiability of the following formula: 

$x_1 = y_0 \wedge x_1 > 0 \wedge x_2 = x_1 + 1 \wedge y_1 = x_2 \wedge y_1 < 0$.
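The SSA-based construction of Example 2.3 can be sketched in a few lines of Python. The function `path_formula`, the tuple encoding of operations, and the `nondet` naming of fresh variables are assumptions of this sketch; it renames variables with a regular expression and returns the list of conjuncts rather than calling a solver.

```python
import re


def path_formula(ops):
    """Return the SSA constraints for a sequence of operations.

    Each operation is ("assign", x, exp) or ("assume", bexp), with
    expressions given as strings; "*" denotes a nondeterministic value.
    """
    index = {}   # current SSA index of each variable (0 if never assigned)
    fresh = 0    # counter for fresh variables replacing "*"

    def rename(expr):
        # Append the current SSA index to every identifier in the expression.
        return re.sub(r"[A-Za-z_]\w*",
                      lambda m: f"{m.group(0)}{index.get(m.group(0), 0)}",
                      expr)

    constraints = []
    for op in ops:
        if op[0] == "assign":
            _, x, exp = op
            if exp == "*":
                rhs = f"nondet{fresh}"
                fresh += 1
            else:
                rhs = rename(exp)           # right-hand side uses old indices
            index[x] = index.get(x, 0) + 1  # the assigned variable gets a new index
            constraints.append(f"{x}{index[x]} = {rhs}")
        else:
            constraints.append(rename(op[1]))
    return constraints


example = [
    ("assign", "x", "y"),
    ("assume", "x > 0"),
    ("assign", "x", "x + 1"),
    ("assign", "y", "x"),
    ("assume", "y < 0"),
]
conjuncts = path_formula(example)
```

The conjunction of these constraints is unsatisfiable, since $x_1 > 0$ forces $y_1 = x_1 + 1 > 1$, which contradicts $y_1 < 0$; the counter-example path is therefore spurious.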

If the counter-example path is infeasible, then it has to be removed from the constructed
ART by refining the precisions. Such a refinement amounts to analyzing the path and
extracting new predicates from it. One successful method for extracting relevant predicates
at certain locations of the CFG is based on the computation of Craig interpolants [Cra57], as
shown in [HJMM04]. Given a pair of formulas $(\varphi^-, \varphi^+)$ such that $\varphi^- \wedge \varphi^+$ is unsatisfiable,
a Craig interpolant of $(\varphi^-, \varphi^+)$ is a formula $\psi$ such that $\varphi^- \Rightarrow \psi$ is valid, $\psi \wedge \varphi^+$ is
unsatisfiable, and $\psi$ contains only variables that are common to both $\varphi^-$ and $\varphi^+$. Given
an infeasible counter-example $\rho$, the predicates can be extracted from interpolants in the
following way:

(1) Let $\sigma_\rho = op_1, \ldots, op_n$, and let the sub-path $\sigma_\rho^{i,j}$ such that $i \le j$ denote the sub-sequence
$op_i, op_{i+1}, \ldots, op_j$ of $\sigma_\rho$.

(2) For every $k = 1, \ldots, n-1$, let $\varphi^{1,k}$ be the path formula for the sub-path $\sigma_\rho^{1,k}$ and
$\varphi^{k+1,n}$ be the path formula for the sub-path $\sigma_\rho^{k+1,n}$; we generate an interpolant $\psi^k$ of
$(\varphi^{1,k}, \varphi^{k+1,n})$.

(3) The predicates are the (un-SSA) atoms in the interpolant $\psi^k$ for $k = 1, \ldots, n-1$.

The discovered predicates are then added to the precisions that are associated with some
locations in the CFG. Let $p$ be a predicate extracted from the interpolant $\psi^k$ of $(\varphi^{1,k}, \varphi^{k+1,n})$
for $1 \le k < n$. Let $\varepsilon_1, \ldots, \varepsilon_n$ be the sequence of edges labelled by the operations $op_1, \ldots, op_n$,
that is, for $i = 1, \ldots, n$, the edge $\varepsilon_i$ is labelled by $op_i$. Let the nodes $(l, \varphi)$ and $(l', \varphi')$ be
the source and target nodes of the edge $\varepsilon_k$. The predicate $p$ can be added to the precision
associated with the location $l'$.

Figure 2: Programming model.

Once the precisions have been refined, the constructed ART is analyzed to remove the 
sub part containing the infeasible counter-example path, and then the ART is reconstructed 
using the refined precisions. 

Lazy predicate abstraction has been implemented in several software model checkers,
including Blast [BHJM07], CpaChecker [BK11], and Kratos [CGM+11]. For details
and in-depth illustrations of ART constructions, we refer the reader to [BHJM07].



3. Programming Model 

In this paper we analyze shared-variable multi-threaded programs with exclusive thread execution
(there is at most one running thread at a time) and a cooperative scheduling policy (the
scheduler never preempts the running thread, but waits until the running thread coopera- 
tively yields the control back to the scheduler). At the moment we do not deal with dynamic 
thread creations. This restriction is not severe because typically multi-threaded programs 
for embedded system designs are such that all threads are known and created a priori, and 
there are no dynamic thread creations. 

Our programming model is depicted in Figure 2. It consists of three components: a
so-called threaded sequential program, a scheduler, and a set of primitive functions. A threaded
sequential program (or threaded program) P is a multi-threaded program consisting of a set
of sequential programs $T_1, \ldots, T_N$ such that each sequential program $T_i$ represents a thread.
From now on, we will refer to the sequential programs in the threaded programs as threads.
We assume that the threaded program has a main thread, denoted by main, from which 
the execution starts. The main thread is responsible for initializing the shared variables. 

Let P be a threaded program; we denote by $GVar$ the set of shared (or global) variables
of P and by $LVar_T$ the set of local variables of the thread $T$ in P. We assume that
$LVar_T \cap GVar = \emptyset$ for every thread $T$ and $LVar_{T_i} \cap LVar_{T_j} = \emptyset$ for each two threads $T_i$
and $T_j$ such that $i \neq j$. We denote by $G_T$ the CFG for the thread $T$. All operations in $G_T$
only access variables in $LVar_T \cup GVar$.

The scheduler governs the executions of threads. It employs a cooperative scheduling 
policy that only allows at most one running thread at a time. The scheduler keeps track of a 
set of variables that are necessary to orchestrate the thread executions and synchronizations. 
We denote such a set by SVar. For example, the scheduler can keep track of the states 
of threads and events, and also the time delays of event notifications. The mapping from
variables in SVar to their values forms a scheduler state. Passing the control to a thread can
be done, for example, by simply setting the state of the thread to running. Such a control
passing is represented by the dashed line in Figure 2.

Primitive functions are special functions used by the threads to communicate with the 
scheduler by querying or updating the scheduler state. To allow threads to call primitive 
functions, we simply extend the form of assignment described in Section 2.1 as follows: the
expression exp of an assignment x := exp can also be a call to a primitive function. We 
assume that such a function call is the top-level expression exp and not nested in another 
expression. Calls to primitive functions do not modify the values of variables occurring in 
the threaded program. Note that, as primitive function calls only occur on the right-hand
side of an assignment, we implicitly assume that every primitive function has a return value.

The primitive functions can be thought of as a programming interface between the 
threads and the scheduler. For example, for event-based synchronizations, one can have a 
primitive function wait_event(e) that is parametrized by an event name e. This function 
suspends the calling thread by telling the scheduler that it is now waiting for the notification 
of event e. Another example is the function notify_event(e) that triggers the notification 
of event e by updating the event's state, which is tracked by the scheduler, to a value 
indicating that it has been notified. In turn, the scheduler can wake up the threads that 
are waiting for the notification of e by making them runnable. 
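The interplay between the two primitive functions and the scheduler can be sketched as follows. This is an illustrative model only: the class `SchedulerState`, the enumeration `Status`, the helper `wake_up_waiters`, and the explicit `thread` argument of `wait_event` (in the text the calling thread is implicit) are all assumptions of the sketch, and the return value 0 is an arbitrary placeholder for the value every primitive function is assumed to return.

```python
from enum import Enum


class Status(Enum):
    RUNNING = 1
    RUNNABLE = 2
    WAITING = 3


class SchedulerState:
    """Hypothetical scheduler state: thread statuses plus event bookkeeping."""

    def __init__(self, threads):
        self.status = {t: Status.RUNNABLE for t in threads}
        self.waiting_for = {}   # thread -> event it is waiting for
        self.notified = set()   # events notified since the last wake-up


def wait_event(sched, thread, event):
    """Primitive function: suspend the calling thread until `event` is notified."""
    sched.status[thread] = Status.WAITING
    sched.waiting_for[thread] = event
    return 0


def notify_event(sched, event):
    """Primitive function: record that `event` has been notified."""
    sched.notified.add(event)
    return 0


def wake_up_waiters(sched):
    """Scheduler side: make runnable the threads waiting for notified events."""
    for thread, event in list(sched.waiting_for.items()):
        if event in sched.notified:
            sched.status[thread] = Status.RUNNABLE
            del sched.waiting_for[thread]
    sched.notified.clear()


sched = SchedulerState(["producer", "consumer"])
wait_event(sched, "consumer", "data_ready")
status_after_wait = sched.status["consumer"]
notify_event(sched, "data_ready")
wake_up_waiters(sched)
status_after_notify = sched.status["consumer"]
```

Note that the primitive functions only touch the scheduler state, in line with the requirement that they do not modify variables of the threaded program.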

We now provide a formal semantics for our programming model. Evaluating expressions 
in program operations involves three kinds of state: 

(1) The state $s_i$ of local variables of some thread $T_i$ ($Dom(s_i) = LVar_{T_i}$).

(2) The state $gs$ of global variables ($Dom(gs) = GVar$).

(3) The scheduler state $\mathbb{S}$ ($Dom(\mathbb{S}) = SVar$).

The evaluation of the right-hand side expression of an assignment requires a scheduler state 
because the expression can be a call to a primitive function whose evaluation depends on 
and can update the scheduler state. 

We require that, for each thread $T$, there is a variable $st_T \in Dom(\mathbb{S})$ that indicates the state
of $T$. We consider the set {Running, Runnable, Waiting} as the domain of $st_T$, where each
element in the set has an obvious meaning. The elements Running, Runnable, and Waiting
can be thought of as enumerations that denote different integers. We say that the thread
$T$ is running, runnable, or waiting in a scheduler state $\mathbb{S}$ if $\mathbb{S}(st_T)$ is, respectively, Running,
Runnable, or Waiting. We denote by $SState$ the set of all scheduler states. Given a threaded
program with N threads $T_1, \ldots, T_N$, by the exclusive running thread property, we have, for
every state $\mathbb{S} \in SState$, if, for some $i$, we have $\mathbb{S}(st_{T_i}) = Running$, then $\mathbb{S}(st_{T_j}) \neq Running$
for all $j \neq i$, where $1 \le i, j \le N$.

The semantics of expressions in program operations are given by the following two
evaluation functions:

$[\![\cdot]\!]_{\mathcal{E}} : exp \rightarrow ((State \times State \times SState) \rightarrow (\mathbb{Z} \times SState))$
$[\![\cdot]\!]_{\mathcal{B}} : bexp \rightarrow ((State \times State \times SState) \rightarrow \{true, false\})$.

The function $[\![\cdot]\!]_{\mathcal{E}}$ takes as arguments an expression occurring on the right-hand side of
an assignment and the above three kinds of state, and returns the value of evaluating the
expression over the states along with the possible updated scheduler state. The function
$[\![\cdot]\!]_{\mathcal{B}}$ takes as arguments a boolean expression and the local and global states, and returns
the valuation of the boolean expression. Figure 3 shows the semantics of expressions in
program operations given by the evaluation functions $[\![\cdot]\!]_{\mathcal{E}}$ and $[\![\cdot]\!]_{\mathcal{B}}$.

Variable:
$[\![x]\!]_{\mathcal{E}}(s, gs, \mathbb{S}) = (v, \mathbb{S})$, where $v = s(x)$ if $x \in Dom(s)$ or $v = gs(x)$ if $x \in Dom(gs)$.

Integer constant:
$[\![c]\!]_{\mathcal{E}}(s, gs, \mathbb{S}) = (c, \mathbb{S})$.

Nondeterministic construct:
$[\![*]\!]_{\mathcal{E}}(s, gs, \mathbb{S}) = (v, \mathbb{S})$, for some $v \in \mathbb{Z}$.

Binary arithmetic operation:
$[\![exp_1 \otimes exp_2]\!]_{\mathcal{E}}(s, gs, \mathbb{S}) = (v_1 \otimes v_2, \mathbb{S})$, where $v_1 = proj_1([\![exp_1]\!]_{\mathcal{E}}(s, gs, \mathbb{S}))$
and $v_2 = proj_1([\![exp_2]\!]_{\mathcal{E}}(s, gs, \mathbb{S}))$.

Primitive function call:
$[\![f(exp_1, \ldots, exp_n)]\!]_{\mathcal{E}}(s, gs, \mathbb{S}) = (v, \mathbb{S}')$, where $(v, \mathbb{S}') = f'(v_1, \ldots, v_n, \mathbb{S})$
and $v_i = proj_1([\![exp_i]\!]_{\mathcal{E}}(s, gs, \mathbb{S}))$, for $i = 1, \ldots, n$.

Relational operation:
$[\![exp_1 \odot exp_2]\!]_{\mathcal{B}}(s, gs, \mathbb{S}) = v_1 \odot v_2$, where $v_1 = proj_1([\![exp_1]\!]_{\mathcal{E}}(s, gs, \mathbb{S}))$
and $v_2 = proj_1([\![exp_2]\!]_{\mathcal{E}}(s, gs, \mathbb{S}))$.

Binary boolean operation:
$[\![bexp_1 \star bexp_2]\!]_{\mathcal{B}}(s, gs, \mathbb{S}) = v_1 \star v_2$, where $v_1 = [\![bexp_1]\!]_{\mathcal{B}}(s, gs, \mathbb{S})$ and
$v_2 = [\![bexp_2]\!]_{\mathcal{B}}(s, gs, \mathbb{S})$.

Figure 3: Semantics of expressions in program operations.

To extract the result
of an evaluation function, we use the standard projection function $proj_i$ to get the $i$-th value
of a tuple. The rules for unary arithmetic operations and unary boolean operations can
be defined similarly to their binary counterparts. For primitive functions, we assume that
every $n$-ary primitive function $f$ is associated with an $(n+1)$-ary function $f'$ such that the
first $n$ arguments of $f'$ are the values resulting from the evaluations of the arguments of $f$,
and the $(n+1)$-th argument of $f'$ is a scheduler state. The function $f'$ returns a pair of
value and updated scheduler state.

Next, we define the meaning of a threaded program by using the operational semantics 
in terms of the CFGs of the threads. The main ingredient of the semantics is the notion 
of run-time configuration. Let $G_T = (L, E, l_0, L_{err})$ be the CFG for a thread $T$. A thread
configuration $\gamma_T$ of $T$ is a pair $(l, s)$, where $l \in L$ and $s$ is a state such that $Dom(s) = LVar_T$.

Definition 3.1 (Configuration). A configuration $\gamma$ of a threaded program P with N threads
$T_1, \ldots, T_N$ is a tuple $(\gamma_{T_1}, \ldots, \gamma_{T_N}, gs, \mathbb{S})$ where

• each $\gamma_{T_i}$ is a thread configuration of thread $T_i$,

• $gs$ is the state of global variables, and

• $\mathbb{S}$ is the scheduler state.

For succinctness, we often refer to the thread configuration $\gamma_{T_i} = (l, s)$ of the thread $T_i$ as
the indexed pair $(l, s)_i$. A configuration $(\gamma_{T_1}, \ldots, \gamma_{T_N}, gs, \mathbb{S})$ is an initial configuration for
a threaded program if for each $i = 1, \ldots, N$, the location $l$ of $\gamma_{T_i} = (l, s)$ is the entry of the
CFG $G_{T_i}$ of $T_i$, and $\mathbb{S}(st_{main}) = Running$ and $\mathbb{S}(st_{T_i}) \neq Running$ for all $T_i \neq main$.

Let $SState_{No} \subseteq SState$ be the set of scheduler states such that every state in $SState_{No}$
has no running thread, and $SState_{One} \subseteq SState$ be the set of scheduler states such that
every state in $SState_{One}$ has exactly one running thread. A scheduler with a cooperative
scheduling policy can simply be defined as a function $Sched : SState_{No} \rightarrow \mathcal{P}(SState_{One})$.
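A minimal instance of such a function can be sketched as follows, assuming a scheduler state that records only the status of each thread. The function name `sched`, the string encoding of statuses, and the policy itself (any runnable thread may be picked next) are assumptions of this sketch; concrete schedulers such as those of SystemC or FairThreads track more information and apply more elaborate rules.

```python
def sched(state):
    """Map a scheduler state with no running thread to its possible successors.

    `state` is a dict from thread names to "Running", "Runnable" or "Waiting".
    Each returned state has exactly one running thread, chosen among the
    runnable ones; an empty list means no thread can be resumed.
    """
    if any(status == "Running" for status in state.values()):
        raise ValueError("Sched is only defined on states with no running thread")
    return [
        {**state, thread: "Running"}
        for thread, status in sorted(state.items())
        if status == "Runnable"
    ]


successors = sched({"T1": "Runnable", "T2": "Waiting", "T3": "Runnable"})
```

Returning a collection of states rather than a single state reflects the powerset in the codomain of $Sched$: the choice of the next thread may be nondeterministic, and each choice yields a state that satisfies the exclusive running thread property.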

The transitions of the semantics are of the form

Edge transition: $\gamma \xrightarrow{op} \gamma'$

Scheduler transition: $\gamma \rightarrow \gamma'$

where $\gamma, \gamma'$ are configurations and $op$ is the operation labelling an edge. Figure 4 shows the
semantics of threaded programs.

$$\frac{G_{T_i} = (L, E, l_0, L_{err}) \qquad (l, [bexp], l') \in E \qquad \mathbb{S}(st_{T_i}) = Running \qquad [\![bexp]\!]_{\mathcal{B}}(s, gs, \mathbb{S}) = true}{(\gamma_{T_1}, \ldots, (l, s)_i, \ldots, \gamma_{T_N}, gs, \mathbb{S}) \xrightarrow{[bexp]} (\gamma_{T_1}, \ldots, (l', s)_i, \ldots, \gamma_{T_N}, gs, \mathbb{S})} \quad (1)$$

$$\frac{G_{T_i} = (L, E, l_0, L_{err}) \qquad (l, x := exp, l') \in E \qquad \mathbb{S}(st_{T_i}) = Running \qquad [\![x := exp]\!]_{\mathcal{E}}(s, gs, \mathbb{S}) = (v, \mathbb{S}') \qquad s' = s[x \mapsto v] \qquad x \in LVar_{T_i}}{(\gamma_{T_1}, \ldots, (l, s)_i, \ldots, \gamma_{T_N}, gs, \mathbb{S}) \xrightarrow{x := exp} (\gamma_{T_1}, \ldots, (l', s')_i, \ldots, \gamma_{T_N}, gs, \mathbb{S}')} \quad (2)$$

$$\frac{G_{T_i} = (L, E, l_0, L_{err}) \qquad (l, x := exp, l') \in E \qquad \mathbb{S}(st_{T_i}) = Running \qquad [\![x := exp]\!]_{\mathcal{E}}(s, gs, \mathbb{S}) = (v, \mathbb{S}') \qquad gs' = gs[x \mapsto v] \qquad x \in GVar}{(\gamma_{T_1}, \ldots, (l, s)_i, \ldots, \gamma_{T_N}, gs, \mathbb{S}) \xrightarrow{x := exp} (\gamma_{T_1}, \ldots, (l', s)_i, \ldots, \gamma_{T_N}, gs', \mathbb{S}')} \quad (3)$$

$$\frac{\forall i.\ \mathbb{S}(st_{T_i}) \neq Running \qquad \mathbb{S}' \in Sched(\mathbb{S})}{(\gamma_{T_1}, \ldots, \gamma_{T_N}, gs, \mathbb{S}) \rightarrow (\gamma_{T_1}, \ldots, \gamma_{T_N}, gs, \mathbb{S}')} \quad (4)$$

Figure 4: Operational semantics of threaded sequential programs.

The first three rules show that transitions over edges of the
CFG $G_T$ of a thread $T$ are defined if and only if $T$ is running, as indicated by the scheduler
state. The first rule shows that a transition over an edge labelled by an assumption is
defined if the boolean expression of the assumption evaluates to true. The second and third
rules show the updates of the states caused by the assignment. Finally, the fourth rule
describes the running of the scheduler.

Definition 3.2 (Computation Sequence, Run, Reachable Configuration). A computation
sequence $\gamma_0, \gamma_1, \ldots$ of a threaded program P is either a finite or an infinite sequence of
configurations of P such that, for all $i$, either $\gamma_i \xrightarrow{op} \gamma_{i+1}$ for some operation $op$ or $\gamma_i \rightarrow \gamma_{i+1}$.
A run of a threaded program P is a computation sequence $\gamma_0, \gamma_1, \ldots$ such that $\gamma_0$ is an
initial configuration. A configuration $\gamma$ of P is reachable from a configuration $\gamma'$ if there
is a computation sequence $\gamma_0, \ldots, \gamma_n$ such that $\gamma_0 = \gamma'$ and $\gamma_n = \gamma$. A configuration $\gamma$ is
reachable in P if it is reachable from an initial configuration.

A configuration (7^ , . . . , (I, s)j, . . . , 7Tjv, 9 s , §>} of a threaded program P is an error 
configuration if CFG = (L, E,Iq, L err ) and I £ L err . We say a threaded program P is 
safe iff no error configuration is reachable in P; otherwise, P is unsafe. 

4. Explicit-Scheduler Symbolic-Thread (ESST) 

In this section we present our novel technique for verifying threaded programs. We call 
our technique Explicit-Scheduler Symbolic-Thread (ESST) [CMNR10]. This technique is a 
CEGAR-based technique that combines explicit-state techniques with the lazy predicate 
abstraction described in Section 2.3. In the same way as the lazy predicate abstraction, 
ESST analyzes the data path of the threads by means of predicate abstraction and analyzes 
the flow of control of each thread with explicit-state techniques. Additionally, ESST 
includes the scheduler as part of its model checking algorithm and analyzes the state of the 
scheduler with explicit-state techniques. 

4.1. Abstract Reachability Forest (ARF). The ESST technique is based on the on-the-fly 
construction and analysis of an abstract reachability forest (ARF). An ARF describes 
the reachable abstract states of the threaded program. It consists of connected 
abstract reachability trees (ARTs), each describing the reachable abstract states of the 
running thread. The connections between one ART and the others in an ARF describe possible 
thread interleavings from the currently running thread to the next running thread. 

Let P be a threaded program with N threads $T_1, \ldots, T_N$. A thread region for the thread 
$T_i$, for $1 \le i \le N$, is a set of thread configurations such that the domain of the states of the 
configurations is $LVar_{T_i} \cup GVar$. A global region for a threaded program P is a set of states 
whose domain is $\bigcup_{i=1,\ldots,N} LVar_{T_i} \cup GVar$. 

Definition 4.1 (ARF Node). An ARF node for a threaded program P with N threads 
$T_1, \ldots, T_N$ is a tuple 

\[ \langle (l_1, \varphi_1), \ldots, (l_N, \varphi_N), \varphi, \mathbb{S} \rangle, \]

where $(l_i, \varphi_i)$, for $i = 1, \ldots, N$, is a thread region for $T_i$, $\varphi$ is a global region, and $\mathbb{S}$ is the 
scheduler state. 

Note that, by definition, the global region, along with the program locations and the 
scheduler state, is sufficient for representing the abstract state of a threaded program. 
However, such a representation will incur some inefficiencies in computing the predicate 
abstraction. That is, without any thread regions, the precision is only associated with 
the global region. Such a precision will undoubtedly contain a lot of predicates about the 
variables occurring in the threaded program. However, when we are interested in computing 
an abstraction of a thread region, we often do not need the predicates consisting only of 
variables that are local to some other threads. 

In ESST we can associate a precision with a location $l_i$ of the CFG $G_T$ for thread $T$, 
denoted by $\pi_{l_i}$, with a thread $T$, denoted by $\pi_T$, or with the global region $\varphi$, denoted by $\pi$. For a 
precision $\pi_T$ and for every location $l$ of $G_T$, we have $\pi_T \subseteq \pi_l$ for the precision $\pi_l$ associated 
with the location $l$. Given a predicate $\psi$ and a location $l$ of the CFG $G_{T_i}$, and letting $fvar(\psi)$ 
be the set of free variables of $\psi$, we can add $\psi$ into the following precisions: 

• If $fvar(\psi) \subseteq LVar_{T_i}$, then $\psi$ can be added into $\pi$, $\pi_{T_i}$, or $\pi_l$. 

• If $fvar(\psi) \subseteq LVar_{T_i} \cup GVar$, then $\psi$ can be added into $\pi$, $\pi_{T_i}$, or $\pi_l$. 

• If $fvar(\psi) \subseteq \bigcup_{j=1,\ldots,N} LVar_{T_j} \cup GVar$, then $\psi$ can be added into $\pi$. 
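
The three cases above amount to a simple check on the free variables of the predicate. The following sketch is only illustrative; the function name admissible_precisions and the labels "global", "thread" and "location" (standing for $\pi$, $\pi_{T_i}$ and $\pi_l$) are names chosen for this example.

```python
def admissible_precisions(fvars, thread, local_vars, global_vars):
    """Precisions to which a predicate found at a location of `thread` may be added.

    fvars       -- free variables of the predicate
    local_vars  -- dict mapping each thread name to its set of local variables
    global_vars -- set of global variables
    """
    fvars = set(fvars)
    if fvars <= local_vars[thread] | global_vars:
        # First and second case: only variables visible to this thread.
        return {"global", "thread", "location"}
    all_vars = set().union(*local_vars.values()) | global_vars
    if fvars <= all_vars:
        # Third case: the predicate mentions locals of other threads.
        return {"global"}
    raise ValueError("predicate mentions variables that are not declared")
```

For instance, with local_vars = {"T1": {"x"}, "T2": {"y"}} and global_vars = {"g"}, a predicate over x and g found in T1 may go into any of the three precisions, while a predicate over x and y can only go into the global one.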

4.2. Primitive Executor and Scheduler. As indicated by the operational semantics of 
threaded programs, besides computing abstract post-conditions, we need to execute calls 
to primitive functions and to explore all possible schedules (or interleavings) during the 
construction of an ARF. For the calls to primitive functions, we assume that the values 
passed as arguments to the primitive functions are known statically. This is a limitation of 
the current ESST algorithm, and we will address this limitation in our future work. 

Recall that SState denotes the set of scheduler states, and let PrimitiveCall be the set 
of calls to primitive functions. To implement the semantic function $[\![exp]\!]_{\mathcal{E}}$, where $exp$ is a 
primitive function call, we introduce the function 

\[ \mathrm{Sexec} : (SState \times PrimitiveCall) \to (\mathbb{Z} \times SState). \]

This function takes as inputs a scheduler state and a call $f(\vec{x})$ to a primitive function $f$, and 
returns a value and an updated scheduler state resulting from the execution of $f$ on the 
arguments $\vec{x}$. That is, $\mathrm{Sexec}(\mathbb{S}, f(\vec{x}))$ essentially computes $[\![f(\vec{x})]\!]_{\mathcal{E}}(\cdot, \cdot, \mathbb{S})$. Since we assume 
that the values of $\vec{x}$ are known statically, we deliberately ignore, by $\cdot$, the states of local 
and global variables. 

Example 4.2. Let us consider a primitive function call wait_event(e) that suspends a 
running thread $T$ and makes the thread wait for a notification of an event $e$. Let $ev_T$ be 
the variable in the scheduler state that keeps track of the event whose notification is waited 
for by $T$. The state $\mathbb{S}'$ of $(\cdot, \mathbb{S}') = \mathrm{Sexec}(\mathbb{S}, \mathtt{wait\_event}(e))$ is obtained from the state $\mathbb{S}$ by 
changing the status of the running thread to Waiting, and noting that the thread is waiting for 
event $e$, that is, $\mathbb{S}' = \mathbb{S}[s_T \mapsto Waiting, ev_T \mapsto e]$. 
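
A minimal sketch of this example is given below. The representation of the scheduler state as two dictionaries ("status" and "event"), the tuple encoding of the call, and the dummy return value 0 are assumptions made for illustration; only the update of the status and of the awaited event follows Example 4.2.

```python
def sexec(sched_state, call):
    """Execute a primitive call on a scheduler state; return (value, new state)."""
    name, args = call
    if name == "wait_event":
        (event,) = args
        running = [t for t, st in sched_state["status"].items() if st == "Running"]
        assert len(running) == 1, "exactly one thread must be running"
        t = running[0]
        new_state = {
            # S' = S[s_T -> Waiting, ev_T -> e]; the input state is left untouched
            "status": {**sched_state["status"], t: "Waiting"},
            "event": {**sched_state["event"], t: event},
        }
        return 0, new_state            # the returned value is ignored by the caller
    raise NotImplementedError(name)    # other primitives are outside this sketch
```

Because the arguments are assumed to be known statically, the function needs neither the local nor the global state of the program.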

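Finally, to implement the scheduler function Sched in the operational semantics, and 
to explore all possible schedules, we introduce the function 

\[ \mathrm{Sched} : SState^{No} \to \mathcal{P}(SState^{One}). \]

This function takes as an input a scheduler state and returns a set of scheduler states that 
represent all possible schedules. 

4.3. ARF Construction. We expand an ARF node by unwinding the CFG of the running 
thread and by running the scheduler. Given an ARF node 

\[ \langle (l_1, \varphi_1), \ldots, (l_N, \varphi_N), \varphi, \mathbb{S} \rangle, \]

we expand the node by the following rules [CMNR10]: 

E1. If there is a running thread $T_i$ in $\mathbb{S}$ such that the thread performs an operation $op$ and 
$(l_i, op, l'_i)$ is an edge of the CFG $G_{T_i}$ of thread $T_i$, then we have two cases: 
• If $op$ is not a call to a primitive function, then the successor node is 

\[ \langle (l_1, \varphi'_1), \ldots, (l'_i, \varphi'_i), \ldots, (l_N, \varphi'_N), \varphi', \mathbb{S} \rangle, \]

where 
(i) $\varphi'_i = SP^{\pi_{l'_i}}_{op}(\varphi_i \wedge \varphi)$ and $\pi_{l'_i}$ is the precision associated with $l'_i$, 
(ii) $\varphi'_j = SP^{\pi_{l_j}}_{\mathrm{HAVOC}(op)}(\varphi_j \wedge \varphi)$ for $j \neq i$ and $\pi_{l_j}$ is the precision associated with $l_j$, 
if $op$ possibly updates global variables, otherwise $\varphi'_j = \varphi_j$, and 
(iii) $\varphi' = SP^{\pi}_{op}(\varphi)$ and $\pi$ is the precision associated with the global region. 
The function HAVOC collects all global variables possibly updated by $op$, and builds 
a new operation where these variables are assigned with fresh variables. The edge 
connecting the original node and the resulting successor node is labelled by the 
operation $op$. 

• If $op$ is a primitive function call $x := f(\vec{y})$, then the successor node is 

\[ \langle (l_1, \varphi'_1), \ldots, (l'_i, \varphi'_i), \ldots, (l_N, \varphi'_N), \varphi', \mathbb{S}' \rangle, \]

where 
(i) $(v, \mathbb{S}') = \mathrm{Sexec}(\mathbb{S}, f(\vec{y}))$, 
(ii) $op'$ is the assignment $x := v$, 
(iii) $\varphi'_i = SP^{\pi_{l'_i}}_{op'}(\varphi_i \wedge \varphi)$ and $\pi_{l'_i}$ is the precision associated with $l'_i$, 
(iv) $\varphi'_j = SP^{\pi_{l_j}}_{\mathrm{HAVOC}(op')}(\varphi_j \wedge \varphi)$ for $j \neq i$ and $\pi_{l_j}$ is the precision associated with $l_j$, 
if $op'$ possibly updates global variables, otherwise $\varphi'_j = \varphi_j$, and 
(v) $\varphi' = SP^{\pi}_{op'}(\varphi)$ and $\pi$ is the precision associated with the global region. 
The edge connecting the original node and the resulting successor node is labelled 
by the operation $op'$. 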

E2. If there is no running thread in $\mathbb{S}$, then, for each $\mathbb{S}' \in \mathrm{Sched}(\mathbb{S})$, we create a successor 
node 

\[ \langle (l_1, \varphi_1), \ldots, (l_N, \varphi_N), \varphi, \mathbb{S}' \rangle. \]

We call such a connection between two nodes an ARF connector. 
Note that the rule E1 constructs the ART that belongs to the running thread, while the 
connections between the ARTs that are established by ARF connectors in the rule E2 
represent possible thread interleavings or context switches. 

An ARF node $\langle (l_1, \varphi_1), \ldots, (l_N, \varphi_N), \varphi, \mathbb{S} \rangle$ is the initial node if for all $i = 1, \ldots, N$, the 
location $l_i$ is the entry location of the CFG of thread $T_i$ and $\varphi_i$ is true, $\varphi$ is true, and 
$\mathbb{S}(s_{main}) = Running$ and $\mathbb{S}(s_{T_i}) \neq Running$ for all $T_i \neq main$. 

We construct an ARF by applying the rules E1 and E2 starting from the initial node. 
A node can be expanded if the node is not covered by other nodes and if the conjunction 
of all its thread regions and the global region is satisfiable. 

Definition 4.3 (Node Coverage). An ARF node $\langle (l_1, \varphi_1), \ldots, (l_N, \varphi_N), \varphi, \mathbb{S} \rangle$ is covered by 
another ARF node $\langle (l'_1, \varphi'_1), \ldots, (l'_N, \varphi'_N), \varphi', \mathbb{S}' \rangle$ if $l_i = l'_i$ for $i = 1, \ldots, N$, $\mathbb{S} = \mathbb{S}'$, and 
$\varphi \Rightarrow \varphi'$ and $\bigwedge_{i=1,\ldots,N} (\varphi_i \Rightarrow \varphi'_i)$ are valid. 

An ARF is complete if it is closed under the expansion of rules E1 and E2. An ARF 
node $\langle (l_1, \varphi_1), \ldots, (l_N, \varphi_N), \varphi, \mathbb{S} \rangle$ is an error node if $\varphi \wedge \bigwedge_{i=1,\ldots,N} \varphi_i$ is satisfiable, and at 
least one of the locations $l_1, \ldots, l_N$ is an error location. An ARF is safe if it is complete 
and does not contain any error node. 
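
The coverage check of Definition 4.3 can be sketched as follows, under a simplifying assumption that is not part of the definition: each region is a conjunction of predicates, represented as a frozenset of strings, and an implication is accepted only when every conjunct of the right-hand side occurs syntactically on the left. This is a sufficient test, not a complete one; a real implementation would discharge the implications with a decision procedure. The names Node, implies and is_covered are hypothetical.

```python
from typing import FrozenSet, NamedTuple, Tuple

class Node(NamedTuple):
    locs: Tuple[int, ...]                    # l_1, ..., l_N
    regions: Tuple[FrozenSet[str], ...]      # thread regions, as sets of conjuncts
    global_region: FrozenSet[str]            # global region, as a set of conjuncts
    sched: Tuple[str, ...]                   # scheduler state

def implies(phi, psi):
    """Syntactic sufficient check for phi => psi on conjunctions of predicates."""
    return psi <= phi

def is_covered(node, other):
    """True if `node` is covered by `other` in the sense of Definition 4.3."""
    return (node.locs == other.locs
            and node.sched == other.sched
            and implies(node.global_region, other.global_region)
            and all(implies(a, b) for a, b in zip(node.regions, other.regions)))
```

A node whose regions carry more conjuncts (a stronger formula) is covered by a node at the same locations and scheduler state with fewer conjuncts, but not the other way round.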

4.4. Counter-example Analysis. Similar to the lazy predicate abstraction for sequential 
programs, during the construction of an ARF, when we reach an error node, we check if 
the path in the ARF from the initial node to the error node is feasible. 

Definition 4.4 (ARF Path). An ARF path $\hat{\rho} = \rho_1, \kappa_1, \rho_2, \ldots, \kappa_{n-1}, \rho_n$ is a finite sequence 
of ART paths $\rho_i$ connected by ARF connectors $\kappa_j$, such that 

(1) $\rho_i$, for $i = 1, \ldots, n$, is an ART path, 

(2) $\kappa_j$, for $j = 1, \ldots, n-1$, is an ARF connector, and 

(3) for every $j = 1, \ldots, n-1$, such that $\rho_j = \varepsilon^j_1, \ldots, \varepsilon^j_m$ and $\rho_{j+1} = \varepsilon^{j+1}_1, \ldots, \varepsilon^{j+1}_k$, the 
target node of $\varepsilon^j_m$ is the source node of $\kappa_j$ and the source node of $\varepsilon^{j+1}_1$ is the target 
node of $\kappa_j$. 

A suppressed ARF path $sup(\hat{\rho})$ of $\hat{\rho}$ is the sequence $\rho_1, \ldots, \rho_n$. 

A counter-example path $\hat{\rho}$ is an ARF path such that the source node of $\varepsilon_1$ of $\rho_1 = 
\varepsilon_1, \ldots, \varepsilon_m$ is the initial node, and the target node of $\varepsilon'_k$ of $\rho_n = \varepsilon'_1, \ldots, \varepsilon'_k$ is an error node. 
Let $\sigma_{sup(\hat{\rho})}$ denote the sequence of operations labelling the edges in $sup(\hat{\rho})$. We say that a 
counter-example path $\hat{\rho}$ is feasible if and only if $SP_{\sigma_{sup(\hat{\rho})}}(true)$ is satisfiable. Similar to the 
case of sequential programs, one can check the feasibility of $\hat{\rho}$ by checking the satisfiability 
of the path formula corresponding to the SSA form of $\sigma_{sup(\hat{\rho})}$. 

Figure 5: An example of a counter-example path. (Edge labels on both the original and the 
suppressed path: x := x+y, y := 7, x := z, [ x < y+z ].) 

Example 4.5. Suppose that the top path in Figure 5 is a counter-example path (the target 
node of the last edge is an error node). The bottom path is the suppressed version of the 
top one. The dashed edge is an ARF connector. To check feasibility of the path by means of 
satisfiability of the corresponding path formula, we check the satisfiability of the following 
formula: 

\[ x_1 = x_0 + y_0 \;\wedge\; y_1 = 7 \;\wedge\; x_2 = z_0 \;\wedge\; x_2 < y_1 + z_0. \]
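
The construction of the path formula in Example 4.5 can be reproduced with a few lines of code. The sketch below is a toy: to_ssa performs the SSA renaming on textual operations (it assumes program variable names do not end in digits), and satisfiable is a brute-force search over a small integer domain that merely stands in for the SMT solver a real implementation would call. Both function names are hypothetical.

```python
import itertools
import re

IDENT = r"[A-Za-z_]\w*"

def to_ssa(ops):
    """Turn ("assign", var, expr) / ("assume", expr) operations into SSA constraints."""
    version = {}

    def rename(expr):
        # Each variable is read at its current version (0 if never assigned).
        return re.sub(IDENT, lambda m: f"{m.group(0)}{version.get(m.group(0), 0)}", expr)

    constraints = []
    for op in ops:
        if op[0] == "assign":
            _, var, expr = op
            rhs = rename(expr)                      # read old versions first
            version[var] = version.get(var, 0) + 1  # then create a new version
            constraints.append(f"{var}{version[var]} == {rhs}")
        else:
            constraints.append(rename(op[1]))
    return constraints

def satisfiable(constraints, domain=range(-1, 8)):
    """Brute-force satisfiability over a small domain (stand-in for an SMT solver)."""
    names = sorted(set(re.findall(IDENT, " ".join(constraints))))
    for values in itertools.product(domain, repeat=len(names)):
        env = dict(zip(names, values))
        if all(eval(c, {"__builtins__": {}}, env) for c in constraints):
            return True
    return False

path = [("assign", "x", "x+y"), ("assign", "y", "7"),
        ("assign", "x", "z"), ("assume", "x < y+z")]
formula = to_ssa(path)
```

For the suppressed path of Figure 5, formula contains the four conjuncts of the formula above, and the search finds a model, so the counter-example path of the example would be reported as feasible.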

4.5. ARF Refinement. When the counter-example path $\hat{\rho}$ is infeasible, we need to rule 
out such a path by refining the precision of nodes in the ARF. ARF refinement amounts to 
finding additional predicates to refine the precisions. Similar to the case of sequential 
programs, these additional predicates can be extracted from the path formula corresponding to 
the sequence $\sigma_{sup(\hat{\rho})}$ by using the Craig interpolant refinement method described in Section 2.3. 

As described in Section 4.1, newly discovered predicates can be added to precisions 
associated with locations, threads, or the global region. Consider again the Craig interpolant 
method in Section 2.3. Let $\varepsilon_1, \ldots, \varepsilon_n$ be the sequence of edges labelled by the operations 
$op_1, \ldots, op_n$ of $\sigma_{sup(\hat{\rho})}$, that is, for $i = 1, \ldots, n$, the edge $\varepsilon_i$ is labelled by $op_i$. Let $p$ be a 
predicate extracted from the interpolant $\psi^k$ of $(\varphi^{1,k}, \varphi^{k+1,n})$ for $1 \le k < n$, and let the 
nodes 

\[ \langle (l_1, \varphi_1), \ldots, (l_i, \varphi_i), \ldots, (l_N, \varphi_N), \varphi, \mathbb{S} \rangle \]

and 

\[ \langle (l_1, \varphi'_1), \ldots, (l'_i, \varphi'_i), \ldots, (l_N, \varphi'_N), \varphi', \mathbb{S}' \rangle \]

be, respectively, the source and target nodes of the edge $\varepsilon_k$ such that the running thread 
in the source node's scheduler state is the thread $T_i$. If $p$ contains only variables local to 
$T_i$, then we can add $p$ to the precision associated with the location $l'_i$, to the precision 
associated with $T_i$, or to the precision associated with the global region. Other precision 
refinement strategies are applicable. For example, one might add a predicate into the 
precision associated with the global region if and only if the predicate contains variables 
local to several threads. 

Similar to the ART refinement in the case of sequential programs, once the precisions 
are refined, we refine the ARF by removing the infeasible counter-example path or by 
removing the part of the ARF that contains the infeasible path, and then reconstruct the 
ARF using the refined precisions. 

4.6. Havocked Operations. Computing the abstract strongest post-conditions with respect 
to the havocked operation in the rule E1 is necessary, not only to keep the regions of 
the ARF node consistent, but, more importantly, to maintain soundness: ESST never reports safe 
for an unsafe case. Suppose that the region of a non-running thread $T$ is the formula $x = g$, 
where $x$ is a variable local to $T$ and $g$ is a shared global variable. Suppose further that the 
global region is true. If the running thread $T'$ updates the value of $g$ with, for example, 
the assignment $g := w$, for some variable $w$ local to $T'$, then the region $x = g$ of $T$ might 
no longer hold, and has to be invalidated. Otherwise, when $T$ resumes and, for example, 
checks the assertion assert(x = g), the stale region would wrongly imply that no assertion 
violation can occur. One way to 
keep the region of $T$ consistent is to update the region using the HAVOC($g := w$) operation, 
as shown in the rule E1. That is, we compute the successor region of $T$ as $SP^{\pi_l}_{g := a}(x = g)$, 
where $a$ is a fresh variable and $l$ is the current location of $T$. The fresh variable $a$ essentially 
denotes an arbitrary value that is assigned to $g$. 

Note that, by using a HAVOC(op) operation, we do not leak variables local to the running 
thread when we update the regions of non-running threads. Unfortunately, the use of 
HAVOC(op) can cause loss of precision. One way to address this issue is to add predicates 
containing local and global variables to the precision associated with the global region. An 
alternative approach, as described in [DKKW11], is to simply use the operation op (leaking 
the local variables) when updating the regions of non-running threads. 
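
The effect of HAVOC on the region of a non-running thread can be illustrated with a toy model. In the sketch below, operations are tuples, a region is a set of predicate strings read as a conjunction, and invalidate is a deliberately coarse over-approximation of the abstract post-condition: it simply drops every conjunct that mentions the havocked variable. The names havoc and invalidate, and the naming scheme of the fresh variables, are assumptions of this example.

```python
import itertools
import re

_fresh = itertools.count()

def havoc(op, global_vars):
    """Replace an assignment to a global variable by an assignment of a fresh variable."""
    if op[0] == "assign" and op[1] in global_vars:
        return ("assign", op[1], f"fresh{next(_fresh)}")
    return None        # op cannot update a global variable: nothing to havoc

def invalidate(region, havocked):
    """Coarse post-condition: keep only conjuncts not mentioning the havocked variable."""
    if havocked is None:
        return set(region)
    var = havocked[1]
    return {p for p in region if var not in re.findall(r"[A-Za-z_]\w*", p)}
```

With the region {"x == g", "x > 0"} of thread T and the operation g := w of thread T', the havocked operation assigns a fresh variable to g, the conjunct x == g is dropped, and the local variable w of T' never appears in the region of T.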



4.7. Summary of ESST. The ESST algorithm takes a threaded program P as an input 
and, when its execution terminates, returns either a feasible counter-example path and 
reports that P is unsafe, or a safe ARF and reports that P is safe. The execution of 
ESST(P) is illustrated in Figure 6: 

(1) Start with an ARF consisting only of the initial node, as shown in Figure 6(a). 

(2) Pick an ARF node that can be expanded and apply the rules E1 or E2 to grow the ARF, 
as shown in Figures 6(b) and 6(c). The different colors denote the different threads to 
which the ARTs belong. 

(3) If we reach an error node, as shown by the red line in Figure 6(d), we analyze the 
counter-example path. 

(a) If the path is feasible, then report that P is unsafe. 

(b) If the path is spurious, then refine the ARF: 

(i) Discover new predicates to refine abstractions. 

(ii) Undo part of the ARF, as shown in Figure 6(e). 

(iii) Go to (2) to reconstruct the ARF. 

(4) If the ARF is safe, as shown in Figure 6(f), then report that P is safe. 



4.8. Correctness of ESST. To prove the correctness of ESST, we need to introduce 
several notions and notations that relate the ESST algorithm with the operational semantics 
in Section 3. Given two states $s_1$ and $s_2$ whose domains are disjoint, we denote by $s_1 \cup s_2$ 
the union of two states such that $Dom(s_1 \cup s_2)$ is $Dom(s_1) \cup Dom(s_2)$, and, for every 
$x \in Dom(s_1 \cup s_2)$, we have 

\[ (s_1 \cup s_2)(x) = \begin{cases} s_1(x) & \text{if } x \in Dom(s_1); \\ s_2(x) & \text{otherwise.} \end{cases} \]

Let P be a threaded program with N threads, and $\gamma$ be a configuration 

\[ \langle (l_1, s_1), \ldots, (l_N, s_N), gs, \mathbb{S} \rangle \]

of P. Let $\eta$ be an ARF node 

\[ \langle (l'_1, \varphi_1), \ldots, (l'_N, \varphi_N), \varphi, \mathbb{S}' \rangle \]

for P. We say that the configuration $\gamma$ satisfies the ARF node $\eta$, denoted by $\gamma \models \eta$, if and 
only if for all $i = 1, \ldots, N$, we have $l_i = l'_i$ and $s_i \cup gs \models \varphi_i$, $\bigcup_{i=1,\ldots,N} s_i \cup gs \models \varphi$, and 
$\mathbb{S} = \mathbb{S}'$. 

Figure 6: ARF construction in ESST. 



By the above definition, it is easy to see that, for any initial configuration $\gamma_0$ of P, we 
have $\gamma_0 \models \eta_0$ for the initial ARF node $\eta_0$. In the sequel we refer to the configurations of 
P and the ARF nodes (or connectors) for P when we speak about configurations and ARF 
nodes (or connectors), respectively. 
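
The union of states and the satisfaction relation between a configuration and an ARF node can be made concrete as follows. This is a sketch under assumptions of our own: states are Python dictionaries, regions are given as Python predicates over such dictionaries (rather than as formulas), and the names union and satisfies are hypothetical.

```python
def union(s1, s2):
    """Union of two states whose domains are disjoint."""
    assert not (s1.keys() & s2.keys()), "domains must be disjoint"
    return {**s1, **s2}

def satisfies(config, node):
    """Check whether a configuration satisfies an ARF node.

    config -- (((l_1, s_1), ..., (l_N, s_N)), gs, sched)
    node   -- (((l'_1, phi_1), ..., (l'_N, phi_N)), phi, sched')
    """
    threads, gs, sched = config
    regions, global_region, node_sched = node
    if sched != node_sched:                       # scheduler states must coincide
        return False
    all_locals = {}
    for (loc, s), (node_loc, phi) in zip(threads, regions):
        if loc != node_loc or not phi(union(s, gs)):   # l_i = l'_i and s_i U gs |= phi_i
            return False
        all_locals = union(all_locals, s)
    return global_region(union(all_locals, gs))   # U s_i U gs |= phi
```

The disjointness assertion in union mirrors the side condition of the definition: local variables of different threads, and global variables, are assumed to have distinct names.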

We now show that the node expansion rules E1 and E2 create successor nodes that are 
over-approximations of the configurations reachable by performing operations considered in 
the rules. 

Lemma 4.6. Let $\eta$ and $\eta'$ be ARF nodes for a threaded program P such that $\eta'$ is a successor 
node of $\eta$. Let $\gamma$ be a configuration of P such that $\gamma \models \eta$. The following properties hold: 
(1) If $\eta'$ is obtained from $\eta$ by the rule E1 with the performed operation $op$, then, for any 
configuration $\gamma'$ of P such that $\gamma \xrightarrow{op} \gamma'$, we have $\gamma' \models \eta'$. 
(2) If $\eta'$ is obtained from $\eta$ by the rule E2, then, for any configuration $\gamma'$ of P such that 
$\gamma \xrightarrow{\ \cdot\ } \gamma'$ and the scheduler states of $\eta'$ and $\gamma'$ coincide, we have $\gamma' \models \eta'$. 

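Let $\varepsilon$ be an ART edge with source node 

\[ \eta = \langle (l_1, \varphi_1), \ldots, (l_i, \varphi_i), \ldots, (l_N, \varphi_N), \varphi, \mathbb{S} \rangle \]

and target node 

\[ \eta' = \langle (l_1, \varphi'_1), \ldots, (l'_i, \varphi'_i), \ldots, (l_N, \varphi'_N), \varphi', \mathbb{S}' \rangle, \]

such that $\mathbb{S}(s_{T_i}) = Running$ and for all $j \neq i$, we have $\mathbb{S}(s_{T_j}) \neq Running$. Let $G_{T_i} = 
(L, E, l_0, L_{err})$ be the CFG for $T_i$ such that $(l_i, op, l'_i) \in E$. Let $\gamma$ and $\gamma'$ be configurations. 
We denote by $\gamma \xrightarrow{\varepsilon} \gamma'$ if $\gamma \models \eta$, $\gamma' \models \eta'$, and $\gamma \xrightarrow{op} \gamma'$. Note that the operation $op$ is the 
operation labelling the edge of the CFG, not the one labelling the ART edge $\varepsilon$. Similarly, we 
denote by $\gamma \xrightarrow{\kappa} \gamma'$ for an ARF connector $\kappa$ if $\gamma \models \eta$, $\gamma' \models \eta'$, and $\gamma \xrightarrow{\ \cdot\ } \gamma'$. Let $\hat{\rho} = \xi_1, \ldots, \xi_m$ 
be an ARF path. That is, for each $i = 1, \ldots, m$, the element $\xi_i$ is either an ART edge or an 
ARF connector. We denote by $\gamma \xrightarrow{\hat{\rho}} \gamma'$ if there exists a computation sequence $\gamma_1, \ldots, \gamma_{m+1}$ 
such that $\gamma_i \xrightarrow{\xi_i} \gamma_{i+1}$ for all $i = 1, \ldots, m$, and $\gamma = \gamma_1$ and $\gamma' = \gamma_{m+1}$. 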

Recall that the notion of strongest post-condition is defined as a set of reachable states 
after executing some operation. We now relate the notion of configuration with the 
notion of strongest post-condition. Let $\gamma$ be a configuration 

\[ \gamma = \langle (l_1, s_1), \ldots, (l_i, s_i), \ldots, (l_N, s_N), gs, \mathbb{S} \rangle, \]

and $\varphi$ be a formula whose free variables range over $\bigcup_{k=1,\ldots,N} Dom(s_k) \cup Dom(gs)$. We say 
that the configuration satisfies the formula $\varphi$, denoted by $\gamma \models \varphi$, if $\bigcup_{k=1,\ldots,N} s_k \cup gs \models \varphi$. 
Suppose that in the above configuration $\gamma$ we have $\mathbb{S}(s_{T_i}) = Running$ and $\mathbb{S}(s_{T_j}) \neq Running$ 
for all $j \neq i$. Let $G_{T_i} = (L, E, l_0, L_{err})$ be the CFG for $T_i$ such that $(l_i, op, l'_i) \in E$. Let $\widehat{op}$ 
be $op$ if $op$ does not contain any primitive function call, otherwise let $\widehat{op}$ be $op'$ as in the second 
case of the expansion rule E1. Then, for any configuration 

\[ \gamma' = \langle (l_1, s_1), \ldots, (l'_i, s'_i), \ldots, (l_N, s_N), gs', \mathbb{S}' \rangle, \]

such that $\gamma \xrightarrow{op} \gamma'$, we have $\gamma' \models SP_{\widehat{op}}(\varphi)$. Note that the scheduler states $\mathbb{S}$ and $\mathbb{S}'$ are not 
constrained by, respectively, $\varphi$ and $SP_{\widehat{op}}(\varphi)$, and so they can be different. 

When ESST(P) terminates and reports that P is safe, we require that, for every 
configuration $\gamma$ reachable in P, there is a node in the resulting ARF $\mathcal{F}$ such that the 
configuration satisfies the node. We denote by Reach(P) the set of configurations reachable 
in P, and by $Nodes(\mathcal{F})$ the set of nodes in $\mathcal{F}$. 

Theorem 4.7 (Correctness). Let P be a threaded program. For every terminating execution 
of ESST(P), we have the following properties: 

(1) If ESST(P) returns a feasible counter-example path $\hat{\rho}$, then we have $\gamma \xrightarrow{\hat{\rho}} \gamma'$ for an 
initial configuration $\gamma$ and an error configuration $\gamma'$ of P. 

(2) If ESST(P) returns a safe ARF $\mathcal{F}$, then for every configuration $\gamma \in Reach(P)$, there 
is an ARF node $\eta \in Nodes(\mathcal{F})$ such that $\gamma \models \eta$. 

5. ESST + Partial-Order Reduction 

The ESST algorithm often has to explore a large number of possible thread interleavings. 
However, some of them might be redundant because the order of interleavings of some 
threads is irrelevant. Given N threads such that each of them accesses a disjoint set of 
variables, there are $N!$ possible interleavings that ESST has to explore. The constructed 
ARF will consist of $2^N$ abstract states (or nodes). Unfortunately, the more abstract states 
to explore, the more computations of abstract strongest post-conditions are needed, and 
the more coverage checks are involved. Moreover, the more interleavings to explore, the 
more possible spurious counter-example paths to rule out, and thus the more refinements are 
needed. As refinements result in keeping track of additional predicates, the computations of 
abstract strongest post-conditions become more expensive. Consequently, exploring all possible 
interleavings degrades the performance of ESST and leads to state explosion. 
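
The two counts can be checked on a toy model in which each of the N threads consists of a single atomic step and a state records only which threads have already run (both simplifications are assumptions of this example, as is the function name count_interleavings).

```python
from itertools import permutations

def count_interleavings(n):
    """Return (number of interleavings, number of distinct states) for n independent threads."""
    orders = list(permutations(range(n)))
    states = set()
    for order in orders:
        done = frozenset()              # set of threads that have already run
        states.add(done)
        for t in order:
            done = done | {t}
            states.add(done)
    return len(orders), len(states)
```

For n = 3 this yields 6 interleavings over 8 states, and for n = 4 it yields 24 interleavings over 16 states, matching $N!$ and $2^N$.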

Partial-order reduction techniques (POR) [God96, Pel93, Val91] have been successfully 
applied in explicit-state software model checkers like SPIN [Hol05] and VeriSoft [God05] 
to avoid exploring redundant interleavings. POR has also been applied to symbolic model 
checking techniques as shown in [KGS06, WYKG08, ABH+01]. In this section we will extend 
the ESST algorithm with POR techniques. However, as we will see, such an integration 
is not trivial because we need to ensure that in the construction of the ARF the POR 
techniques do not make ESST unsound. 

5.1. Partial-Order Reduction (POR). Partial-order reduction (POR) is a model checking 
technique that is aimed at combating the state explosion by exploring only a representative 
subset of all possible interleavings. POR exploits the commutativity of concurrent 
transitions that result in the same state when they are executed in different orders. 

We present POR using the standard notions and notations used in [God96, CGP99]. 
We model a concurrent program as a transition system $M = (S, S_0, T)$, where $S$ is the finite 
set of states, $S_0 \subseteq S$ is the set of initial states, and $T$ is a set of transitions such that for 
each $\alpha \in T$, we have $\alpha \subseteq S \times S$. We say that $\alpha(s, s')$ holds and often write it as $s \xrightarrow{\alpha} s'$ 
if $(s, s') \in \alpha$. A state $s'$ is a successor of a state $s$ if $s \xrightarrow{\alpha} s'$ for some transition $\alpha \in T$. In 
the following we will only consider deterministic transitions, and often write $s' = \alpha(s)$ for 
$\alpha(s, s')$. A transition $\alpha$ is enabled in a state $s$ if there is a state $s'$ such that $\alpha(s, s')$ holds. 
The set of transitions enabled in a state $s$ is denoted by $enabled(s)$. A path from a state $s$ 
in a transition system is a finite or infinite sequence $s_0 \xrightarrow{\alpha_0} s_1 \xrightarrow{\alpha_1} \cdots$ such that $s = s_0$ and 
$s_i \xrightarrow{\alpha_i} s_{i+1}$ for all $i$. A path is empty if the sequence consists only of a single state. The 
length of a finite path is the number of transitions in the path. 

Let $M = (S, S_0, T)$ be a transition system. We denote by $Reach(S_0, T) \subseteq S$ the set of 
states reachable from the states in $S_0$ by the transitions in $T$: for a state $s \in Reach(S_0, T)$, 
there is a finite path $s_0 \xrightarrow{\alpha_0} \cdots \xrightarrow{\alpha_{n-1}} s_n$ in the transition system such that $s_0 \in S_0$ and $s = s_n$. 
In this work we are interested in verifying safety properties in the form of program assertions. 
To this end, we assume that there is a set $T_{err} \subseteq T$ of error transitions such that the set 

\[ E_{M,T_{err}} = \{ s \in S \mid \exists s' \in S.\ \exists \alpha \in T_{err}.\ \alpha(s', s) \text{ holds} \} \]

is the set of error states of $M$ with respect to $T_{err}$. A transition system $M = (S, S_0, T)$ is 
safe with respect to the set $T_{err} \subseteq T$ of error transitions iff $Reach(S_0, T) \cap E_{M,T_{err}} = \emptyset$. 

Selective search in POR exploits the commutativity of concurrent transitions. The 
concept of commutativity of concurrent transitions can be formulated by defining an 
independence relation on pairs of transitions. 

Definition 5.1 (Independence Relation, Independent Transitions). An independence 
relation $I \subseteq T \times T$ is a symmetric, anti-reflexive relation such that for each state $s \in S$ and for 
each $(\alpha, \beta) \in I$ the following conditions are satisfied: 

Enabledness: If $\alpha$ is in $enabled(s)$, then $\beta$ is in $enabled(s)$ iff $\beta$ is in $enabled(\alpha(s))$. 
Commutativity: If $\alpha$ and $\beta$ are in $enabled(s)$, then $\alpha(\beta(s)) = \beta(\alpha(s))$. 
We say that two transitions $\alpha$ and $\beta$ are independent of each other if for every state $s$ they 
satisfy the enabledness and commutativity conditions. We also say that two transitions 
$\alpha$ and $\beta$ are independent of each other in a state $s$ if they satisfy the enabledness and 
commutativity conditions in $s$. 
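
On an explicit, finite transition system the two conditions can be checked directly. In the sketch below a deterministic transition is a dictionary from the states in which it is enabled to their successors; this encoding, the function names independent_in and independent, and the example transitions are assumptions made for illustration. Since an independence relation is symmetric, the enabledness condition is checked in both directions.

```python
def independent_in(a, b, s):
    """Check enabledness and commutativity of transitions a and b in state s."""
    if s in a and (s in b) != (a[s] in b):   # enabledness for the pair (a, b)
        return False
    if s in b and (s in a) != (b[s] in a):   # enabledness for the pair (b, a)
        return False
    if s in a and s in b:                    # commutativity
        return a[b[s]] == b[a[s]]
    return True

def independent(a, b, states):
    """Two transitions are independent if the conditions hold in every state."""
    return all(independent_in(a, b, s) for s in states)

# States are pairs (x, y) with x, y in {0, 1}.
states = [(x, y) for x in (0, 1) for y in (0, 1)]
inc_x = {(0, y): (1, y) for y in (0, 1)}     # enabled when x == 0, sets x to 1
inc_y = {(x, 0): (x, 1) for x in (0, 1)}     # enabled when y == 0, sets y to 1
reset_x = {(1, y): (0, y) for y in (0, 1)}   # enabled when x == 1, sets x to 0
```

Here inc_x and inc_y touch different variables and are independent, whereas inc_x enables reset_x and therefore violates the enabledness condition.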

In the sequel we will use the notion of valid dependence relation to select a representative 
subset of transitions that need to be explored. 

Definition 5.2 (Valid Dependence Relation). A valid dependence relation $D \subseteq T \times T$ is 
a symmetric, reflexive relation such that for every $(\alpha, \beta) \notin D$, the transitions $\alpha$ and $\beta$ are 
independent of each other. 

5.1.1. The Persistent Set Approach. To reduce the number of possible interleavings, in every 
state visited during the state space exploration one only explores a representative subset 
of transitions that are enabled in that state. However, to select such a subset we have to 
avoid possible dependencies that can happen in the future. To this end, we appeal to the 
notion of persistent set [God96]. 

Definition 5.3 (Persistent Set). A set $P \subseteq T$ of enabled transitions in a state $s$ is persistent 
in $s$ if for every finite non-empty path $s = s_0 \xrightarrow{\alpha_0} s_1 \xrightarrow{\alpha_1} \cdots \xrightarrow{\alpha_{n-1}} s_n \xrightarrow{\alpha_n} s_{n+1}$ such that $\alpha_i \notin P$ 
for all $i = 0, \ldots, n$, we have $\alpha_n$ independent of any transition in $P$ in $s_n$. 

Note that the persistent set in a state is not unique. To guarantee the existence of a 
successor state, we impose the successor-state condition on the persistent set: the persistent 
set in $s$ is empty iff so is $enabled(s)$. In the sequel we assume persistent sets satisfy the 
successor-state condition. We say that a state $s$ is fully expanded if the persistent set in $s$ 
equals $enabled(s)$. It is easy to see that, for any transition $\alpha$ not in the persistent set $P$ in 
a state $s$, the transition $\alpha$ is disabled in $s$ or independent of any transition in $P$. 

We denote by $Reach_{red}(S_0, T) \subseteq S$ the set of states reachable from the states in $S_0$ 
by the transitions in $T$ such that, during the state space exploration, in every visited state 
we only explore the transitions in the persistent set in that state. That is, for a state 
$s \in Reach_{red}(S_0, T)$, there is a finite path $s_0 \xrightarrow{\alpha_0} \cdots \xrightarrow{\alpha_{n-1}} s_n$ in the transition system such 
that $s_0 \in S_0$ and $s = s_n$, and $\alpha_i$ is in the persistent set of $s_i$, for $i = 0, \ldots, n-1$. It is easy 
to see that $Reach_{red}(S_0, T) \subseteq Reach(S_0, T)$. 

To preserve safety properties of a transition system, we need to guarantee that the 
reduction by means of persistent sets does not remove all interleavings that lead to an 
error state. To this end, we impose the cycle condition on $Reach_{red}(S_0, T)$ [CGP99, Pel93]: 
a cycle is not allowed if it contains a state in which a transition $\alpha$ is enabled, but $\alpha$ is 
never included in the persistent set of any state $s$ on the cycle. That is, if there is a cycle 
$s_0 \xrightarrow{\alpha_0} \cdots \xrightarrow{\alpha_{n-1}} s_n = s_0$ induced by the states $s_0, \ldots, s_{n-1}$ in $Reach_{red}(S_0, T)$ such that $\alpha_i$ is 
persistent in $s_i$, for $i = 0, \ldots, n-1$, and $\alpha \in enabled(s_j)$ for some $0 \le j < n$, then $\alpha$ must 
be in the persistent set of any of $s_0, \ldots, s_{n-1}$. 

Theorem 5.4. A transition system $M = (S, S_0, T)$ is safe w.r.t. a set $T_{err} \subseteq T$ of error 
transitions iff $Reach_{red}(S_0, T)$ that satisfies the cycle condition does not contain any error 
state from $E_{M,T_{err}}$. 

5.1.2. The Sleep Set Approach. The sleep set POR technique exploits independencies of 
enabled transitions in the current state. For example, suppose that in some state $s$ there 
are two enabled transitions $\alpha$ and $\beta$, and they are independent of each other. Suppose 
further that the search explores $\alpha$ first from $s$. Then, when the search explores $\beta$ from 
$s$ such that $s \xrightarrow{\beta} s'$ for some state $s'$, we associate with $s'$ a sleep set containing only $\alpha$. 
From $s'$ the search only explores transitions that are not in the sleep set of $s'$. That is, 
although the transition $\alpha$ is still enabled in $s'$, it will not be explored. The persistent 
set and sleep set techniques are orthogonal and complementary, and thus can be applied 
simultaneously. Note that the sleep set technique only removes transitions, and not states. 
Thus, Theorem 5.4 still holds when the sleep set technique is applied. 
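
The following sketch shows the sleep set idea on a stateless depth-first search over a small acyclic system. It is a textbook-style illustration and not the ESST integration: transitions are dictionaries from enabling states to successors, the independence test is passed in as a function, and the names search, one_shot and run are hypothetical.

```python
from itertools import combinations

def search(s, transitions, independent, sleep, visited, stats, use_sleep_sets=True):
    """Stateless depth-first search that records visited states and executed transitions."""
    visited.add(s)
    explored = []                                   # transitions already taken from s
    for name, t in transitions.items():
        if s not in t or (use_sleep_sets and name in sleep):
            continue                                # disabled, or asleep in s
        stats["executed"] += 1
        # The successor inherits every sleeping or already explored transition
        # that is independent of the transition being executed.
        child_sleep = frozenset(u for u in sleep | set(explored) if independent(u, name))
        search(t[s], transitions, independent, child_sleep, visited, stats, use_sleep_sets)
        explored.append(name)

names = ["a", "b", "c"]

def one_shot(name):
    """Transition that fires once: enabled in every state where it has not fired yet."""
    others = [n for n in names if n != name]
    subsets = [frozenset(c) for r in range(len(others) + 1) for c in combinations(others, r)]
    return {done: done | {name} for done in subsets}

transitions = {n: one_shot(n) for n in names}

def run(use_sleep_sets):
    visited, stats = set(), {"executed": 0}
    search(frozenset(), transitions, lambda u, v: u != v, frozenset(),
           visited, stats, use_sleep_sets)
    return stats["executed"], visited
```

With three pairwise independent transitions, the plain search executes 15 transitions and the sleep set search executes 7, while both visit the same 8 states, in line with the remark that sleep sets remove transitions but not states.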

5.2. Applying POR to ESST. The key idea of applying POR to ESST is to select a 
representative subset of scheduler states output by the scheduler in ESST. That is, instead 
of creating successor nodes with all scheduler states from $\{\mathbb{S}_1, \ldots, \mathbb{S}_n\} = \mathrm{Sched}(\mathbb{S})$, for some 
state $\mathbb{S}$, we create successor nodes with the representative subset of $\{\mathbb{S}_1, \ldots, \mathbb{S}_n\}$. However, 
such an application is non-trivial. The ESST algorithm is based on the construction of 
an ARF that describes the reachable abstract states, while the exposition of POR before 
is based on the analysis of reachable concrete states. As we will see later, some POR 
properties that hold in the concrete state space do not hold in the abstract state space. 
Nevertheless, in applying POR to ESST one needs to guarantee that the original ARF is 
safe if and only if the reduced ARF, obtained by the restriction on the scheduler's output, 
is safe. In particular, the construction of reduced ARF has to check if the cycle condition 
is satisfied in its concretization. 

To integrate POR techniques into the ESST algorithm, we first need to identify frag- 
ments in the threaded program that count as transitions in the transition system. In the 
previous description of POR the execution of a transition is atomic, that is, its execution 
cannot be interleaved by the executions of other transitions. We introduce the notion of 
atomic block as the notion of transition in the threaded program. Intuitively, an atomic 
block is a block of operations between calls to primitive functions that can suspend the 
thread. Let us call such primitive functions blocking functions. 

An atomic block of a thread is a rooted subgraph of the CFG such that the subgraph 
satisfies the following conditions: 

(1) its unique entry is the entry of the CFG or the location that immediately follows a call 
to a blocking function; 

(2) its exit is the exit of the CFG or the location that immediately follows a call to a 
blocking function; and 

(3) there is no call to a blocking function in any CFG path from the entry to an exit except 
the one that precedes the exit. 

Figure 7: Identifying atomic blocks. 



Note that an atomic block has a unique entry, but can have multiple exits. We often identify 
an atomic block by its entry. Furthermore, we denote by ABlock the set of atomic blocks. 

Example 5.5. Consider a thread whose CFG is depicted in Figure 7(a). Let wait(...) be 
the only call to a blocking function in the CFG. Figures 7(b) and 7(c) depict the atomic 
blocks of the thread. The atomic block in Figure 7(b) starts from $l_0$ and exits at $l_5$ and $l_7$, 
while the one in Figure 7(c) starts from $l_5$ and exits at $l_5$ and $l_7$. 

Note that an atomic block can span over multiple basic blocks or even multiple large 
blocks in the basic block or large block encoding [BCG+09]. In the sequel we will use the 
terms transition and atomic block interchangeably. 
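
The three conditions defining an atomic block suggest a simple traversal of the CFG that starts at each block entry and stops right after a blocking call. The sketch below is one possible reading; the edge encoding, the function name atomic_blocks, and the example CFG are hypothetical (the CFG is not the one of Figure 7, although it is built so that its blocks have the same entries and exits as in Example 5.5).

```python
def atomic_blocks(edges, entry, is_blocking):
    """Map each atomic-block entry to (locations of the block, exits of the block)."""
    succ = {}
    for src, op, dst in edges:
        succ.setdefault(src, []).append((op, dst))
    # A block starts at the CFG entry or right after a blocking call.
    entries = {entry} | {dst for _, op, dst in edges if is_blocking(op)}
    blocks = {}
    for e in entries:
        inner, exits, stack = {e}, set(), [e]
        while stack:
            loc = stack.pop()
            if loc not in succ:                 # no outgoing edge: exit of the CFG
                exits.add(loc)
            for op, dst in succ.get(loc, []):
                if is_blocking(op):
                    exits.add(dst)              # the block ends right after the call
                elif dst not in inner:
                    inner.add(dst)
                    stack.append(dst)
        blocks[e] = (inner | exits, exits)
    return blocks

# Hypothetical CFG: a loop whose body ends with a blocking wait.
cfg = [(0, "init", 1), (1, "[c]", 2), (1, "[!c]", 6), (2, "work", 4),
       (4, "wait", 5), (5, "step", 1), (6, "done", 7)]
blocks = atomic_blocks(cfg, 0, lambda op: op == "wait")
```

The result contains two blocks, one entered at location 0 and one entered at location 5, and both exit at locations 5 and 7.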

Prior to computing persistent sets, we need to compute valid dependence relations. 
The criteria for two transitions being dependent are different from one application domain 
to the other. Cooperative threads in many embedded system domains employ event-based 
synchronizations through event waits and notifications. Different domains can have different 
types of event notification. For generality, we anticipate two kinds of notification: immediate 
and delayed notifications. An immediate notification is materialized immediately at the 
current time or at the current cycle (for cycle-based semantics). Threads that are waiting 
for the notified events are made runnable upon the notification. A delayed notification is 
scheduled to be materialized at some future time or at the end of the current cycle. In some 
domains delayed notifications can be cancelled before they are triggered. 

For example, in a system design language that supports event-based synchronization, a 
pair $(\alpha, \beta)$ of atomic blocks is in a valid dependence relation if one of the following 
criteria is satisfied: (1) the atomic block $\alpha$ contains a write to a shared (or global) variable 
$g$, and the atomic block $\beta$ contains a write or a read to $g$; (2) the atomic block $\alpha$ contains 
an immediate notification of an event $e$, and the atomic block $\beta$ contains a wait for $e$; (3) the 
atomic block $\alpha$ contains a delayed notification of an event $e$, and the atomic block $\beta$ 
contains a cancellation of a notification of $e$. Note that the first criterion is a standard 
criterion for two blocks to become dependent on each other. That is, the order of executions 
of the two blocks is relevant because different orders yield different values assigned to variables. 
The second and the third criteria are specific to event-based synchronization languages. An 
event notification can make runnable a thread that is waiting for a notification of the event. 
A waiting thread misses an event notification if the thread waited for such a notification 

Algorithm 1 Persistent sets. 

Input: a set $B_{en}$ of enabled atomic blocks. 
Output: a persistent set $P$. 

(1) Let $B := \{\alpha\}$, where $\alpha \in B_{en}$. 

(2) For each atomic block $\alpha \in B$: 

    (a) If $\alpha \in B_{en}$ ($\alpha$ is enabled): 

        • Add into $B$ every atomic block $\beta$ such that $(\alpha, \beta) \in D$. 

    (b) If $\alpha \notin B_{en}$ ($\alpha$ is disabled): 

        • Add into $B$ a necessary enabling set for $\alpha$ with $B_{en}$. 

(3) Repeat step 2 until no more atomic blocks can be added into $B$. 

(4) $P := B \cap B_{en}$. 



after another thread had made the notification. Thus, the order of executions of atomic 
blocks containing event waits and event notifications is relevant. Similarly for the delayed 
notification in the third criterion. Given criteria for being dependent, one can use static 
analysis techniques to compute a valid dependence relation. 
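
As a rough illustration of the three criteria, the following Python sketch checks dependence from per-block summaries. The Block record and its field names are assumptions of this sketch (such summaries would come from a static analysis), and taking the symmetric closure of the criteria is likewise an assumption made here for convenience.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Block:
    """Hypothetical summary of what an atomic block may do."""
    name: str
    reads: frozenset = frozenset()            # shared variables read
    writes: frozenset = frozenset()           # shared variables written
    notifies_now: frozenset = frozenset()     # events notified immediately
    notifies_later: frozenset = frozenset()   # events with delayed notification
    waits: frozenset = frozenset()            # events waited for
    cancels: frozenset = frozenset()          # events whose notification is cancelled

def _criteria(a, b):
    """Criteria (1)-(3) for the ordered pair (a, b)."""
    return bool(
        a.writes & (b.reads | b.writes)       # (1) write vs. read/write on g
        or a.notifies_now & b.waits           # (2) immediate notify vs. wait
        or a.notifies_later & b.cancels       # (3) delayed notify vs. cancel
    )

def dependent(a, b):
    """Symmetric closure of the criteria (an assumption of this sketch)."""
    return _criteria(a, b) or _criteria(b, a)

W = Block("w", writes=frozenset({"g"}))
R = Block("r", reads=frozenset({"g"}))
N = Block("n", notifies_now=frozenset({"e"}))
A = Block("a", waits=frozenset({"e"}))
D_ = Block("d", notifies_later=frozenset({"e"}))
C = Block("c", cancels=frozenset({"e"}))
```

With these summaries, the pairs (W, R), (N, A) and (D_, C) are dependent, while a block that only reads g and a block that only waits for e are not.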

To have small persistent sets, we need to know whether a disabled transition that has 
a dependence relation with the currently enabled ones can be made enabled in the future. 
To this end, we use the notion of necessary enabling set introduced in [God96]. 

Definition 5.6 (Necessary Enabling Set). Let $M = (S, S_0, T)$ be a transition system such 
that a transition $\alpha \in T$ is disabled in a state $s \in S$. A set $T_{\alpha,s} \subseteq T$ is a necessary enabling 
set for $\alpha$ in $s$ if for every finite path $s = s_0 \xrightarrow{t_0} s_1 \cdots \xrightarrow{t_{n-1}} s_n$ in $M$ such that $\alpha$ is disabled in 
$s_i$, for all $0 \le i < n$, but is enabled in $s_n$, a transition $t_j$, for some $0 \le j \le n-1$, is in 
$T_{\alpha,s}$. A set $T_{\alpha,T_{en}} \subseteq T$, for $T_{en} \subseteq T$, is a necessary enabling set for $\alpha$ with $T_{en}$ if $T_{\alpha,T_{en}}$ is a 
necessary enabling set for $\alpha$ in every state $s$ such that $T_{en}$ is the set of enabled transitions 
in $s$. 

Intuitively, a necessary enabling set $T_{\alpha,s}$ for a transition $\alpha$ in a state $s$ is a set of transitions 
such that $\alpha$ cannot become enabled in the future before at least one transition in $T_{\alpha,s}$ is 
executed. 

Algorithm 1 computes persistent sets using a valid dependence relation $D$. It is easy to 
see that the persistent set computed by the algorithm satisfies the successor-state condition. 
The algorithm is also a variant of the stubborn set algorithm presented in [God96]; that is, 
we use a valid dependence relation as the interference relation used in the latter algorithm. 
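
A worklist version of Algorithm 1 can be sketched in Python as follows. The function and parameter names are assumptions of this sketch: dependent stands for membership in $D$, and necessary_enabling_set is a user-supplied oracle returning a necessary enabling set for a disabled block with the current enabled set.

```python
def persistent_set(enabled, seed, all_blocks, dependent, necessary_enabling_set):
    """Sketch of Algorithm 1: grow B from {seed}, then restrict to enabled blocks."""
    assert seed in enabled                      # step (1): seed is an enabled block
    B, work = {seed}, [seed]
    while work:                                 # steps (2)-(3): iterate to a fixpoint
        a = work.pop()
        if a in enabled:                        # (2a): add every block dependent on a
            new = {b for b in all_blocks if dependent(a, b)}
        else:                                   # (2b): add a necessary enabling set
            new = set(necessary_enabling_set(a, enabled))
        for b in new - B:
            B.add(b)
            work.append(b)
    return B & set(enabled)                     # step (4): P := B ∩ B_en

# Hypothetical instance: "d" is disabled, depends on "a", and can only be
# enabled after "c" has been executed.
ALL = {"a", "b", "c", "d"}
ENABLED = {"a", "b", "c"}
DEP = {frozenset({"a", "d"})}

def dep(x, y):
    return frozenset({x, y}) in DEP

def nes(block, enabled):
    return {"c"} if block == "d" else set()

P_FROM_A = persistent_set(ENABLED, "a", ALL, dep, nes)
P_FROM_B = persistent_set(ENABLED, "b", ALL, dep, nes)
```

Starting from "a" the sketch returns {"a", "c"}, while starting from "b" it returns the singleton {"b"}; the choice of the seed block therefore influences how small the persistent set is.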

We apply POR to the ESST algorithm by modifying the ARF node expansion rule E2 
described in Section [5] in two steps. First we compute a persistent set from a set of scheduler 
states output by the function Sched. Second, we ensure that the cycle condition is satisfied 
by the concretization of the constructed ARF. 

We introduce the function Persistent that computes a persistent set of a set of scheduler 
states. Persistent takes as inputs an ARF node and a set $S$ of scheduler states, and 
outputs a subset $S'$ of $S$. The input ARF node keeps track of the thread locations, which 
are used to identify atomic blocks, while the input scheduler states keep track of the status 
of the threads. From the ARF node and the set $S$, the function Persistent extracts the 
set $B_{en}$ of enabled atomic blocks. Persistent then computes a persistent set $P$ from $B_{en}$ 
using Algorithm 1. Finally, Persistent constructs back a subset $S'$ of the input set $S$ of 
scheduler states from the persistent set $P$. 

Algorithm 2 ARF expansion algorithm for non-running node. 

Input: a non-running ARF node $\eta$ that contains no error locations. 

(1) Let NonRunning(ARFPath($\eta$, $\mathcal{F}$)) be $\eta_0, \ldots, \eta_m$ such that $\eta = \eta_m$. 

(2) If there exists $i < m$ such that $\eta_i$ covers $\eta$: 

    (a) Let $\eta_{m-1} = (\langle l'_1, \varphi'_1\rangle, \ldots, \langle l'_N, \varphi'_N\rangle, \varphi', \mathbb{S}')$. 

    (b) If Persistent($\eta_{m-1}$, Sched($\mathbb{S}'$)) $\subset$ Sched($\mathbb{S}'$): 

        • For all $\mathbb{S}'' \in$ Sched($\mathbb{S}'$) $\setminus$ Persistent($\eta_{m-1}$, Sched($\mathbb{S}'$)): 

            - Create a new ART with root node $(\langle l'_1, \varphi'_1\rangle, \ldots, \langle l'_N, \varphi'_N\rangle, \varphi', \mathbb{S}'')$. 

(3) If $\eta$ is covered: Mark $\eta$ as covered. 

(4) If $\eta$ is not covered: Expand $\eta$ by rule E2'. 



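Let $\eta = (\langle l_1, \varphi_1\rangle, \ldots, \langle l_N, \varphi_N\rangle, \varphi, \mathbb{S})$ be an ARF node that is going to be expanded. We 
replace the rule E2 in the following way: instead of creating a new ART for each state 
$\mathbb{S}' \in$ Sched($\mathbb{S}$), we create a new ART whose root is the node $(\langle l_1, \varphi_1\rangle, \ldots, \langle l_N, \varphi_N\rangle, \varphi, \mathbb{S}')$ 
for each state $\mathbb{S}' \in$ Persistent($\eta$, Sched($\mathbb{S}$)) (rule E2'). 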

To guarantee the preservation of safety properties, we have to check that the cycle 
condition is satisfied. Following [CGP99], we check a stronger condition: at least one state 
along the cycle is fully expanded. In the ESST algorithm a potential cycle occurs if an ARF 
node is covered by one of its predecessors in the ARF. Let $\eta = (\langle l_1, \varphi_1\rangle, \ldots, \langle l_N, \varphi_N\rangle, \varphi, \mathbb{S})$ 
be an ARF node. We say that the scheduler state $\mathbb{S}$ is running if there is a running thread 
in $\mathbb{S}$. We also say that the node $\eta$ is running if its scheduler state $\mathbb{S}$ is. Note that during 
ARF expansion the input of Sched is always a non-running scheduler state. A path in an 
ARF can be represented as a sequence $\eta_0, \ldots, \eta_m$ of ARF nodes such that for all $i$, 
$\eta_{i+1}$ is a successor of $\eta_i$ in the same ART or there is an ARF connector from $\eta_i$ to $\eta_{i+1}$. 
Given an ARF node $\eta$ of ARF $\mathcal{F}$, we denote by ARFPath($\eta$, $\mathcal{F}$) the ARF path $\eta_0, \ldots, \eta_m$ 
such that $\eta_0$ has neither a predecessor ARF node nor an incoming ARF connector, and 
$\eta_m = \eta$. Let $\rho$ be an ARF path; we denote by NonRunning($\rho$) the maximal subsequence of 
non-running nodes in $\rho$. 

Algorithm 2 shows how a non-running ARF node $\eta$ is expanded in the presence of 
POR. We assume that $\eta$ is not an error node. The algorithm fully expands the immediate 
non-running predecessor node of $\eta$ when a potential cycle is detected. Otherwise the node 
is expanded as usual. 

Our POR technique slightly differs from that of [CGP99]. On computing the successor 
states of a state $s$, the technique in [CGP99] tries to compute a persistent set $P$ in $s$ that 
does not create a cycle. That is, particularly for the depth-first search (DFS) exploration, 
for every $\alpha$ in $P$, the successor state $\alpha(s)$ is not in the DFS stack. If it does not succeed, then 
it fully expands the state. Because the technique in [CGP99] is applied to explicit-state 
model checking, computing the successor state $\alpha(s)$ is cheap. 
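
The explicit-state check described above can be sketched as a DFS with a stack-based proviso. This is a simplified reading of the description in this section, not a transcription of [CGP99]: the sketch tries a single candidate persistent set per state, and the callbacks enabled, persistent and step are assumptions supplied by the caller.

```python
def dfs_with_proviso(init, enabled, persistent, step):
    """Explicit-state DFS; returns the set of visited states.

    enabled(s)    -> set of transitions enabled in s
    persistent(s) -> candidate persistent subset of enabled(s)
    step(s, t)    -> successor state t(s)
    """
    visited, on_stack = set(), set()

    def visit(s):
        visited.add(s)
        on_stack.add(s)
        chosen = persistent(s)
        # If some chosen transition leads back to the DFS stack, the reduced
        # expansion could close a cycle: fully expand s instead.
        if any(step(s, t) in on_stack for t in chosen):
            chosen = enabled(s)
        for t in chosen:
            nxt = step(s, t)
            if nxt not in visited:
                visit(nxt)
        on_stack.discard(s)

    visit(init)
    return visited

# Hypothetical system: "a" flips x (always enabled), "b" sets y to 1
# (enabled while y == 0). The two transitions are independent.
def enabled(s):
    return {"a", "b"} if s[1] == 0 else {"a"}

def step(s, t):
    x, y = s
    return (1 - x, y) if t == "a" else (x, 1)

def persistent(s):
    return {"a"}

VISITED = dfs_with_proviso((0, 0), enabled, persistent, step)
```

Without the check, the search would loop between (0, 0) and (1, 0) and never execute "b"; with it, the state (1, 0) is fully expanded and the states with y = 1 are reached. Each call to step here is a cheap tuple update, which is what makes the check affordable in the explicit-state setting.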

In our context, to detect a cycle, one has to expand an ARF node by a transition (or 
an atomic block) that can span multiple operations in the CFG, and thus may require 
multiple applications of the rule E1. As the rule involves expensive computations of abstract 
strongest post-conditions, detecting a cycle using the technique in [CGP99] is bound to be 
expensive. 

In addition to the coverage check, in the above algorithm one can also check if the detected 
cycle is spurious. We fully expand a node only if the detected cycle is not spurious. When 
cycles are rare, the benefit of POR can be defeated by the price of generating and solving 
the constraints representing the cycle. 

Algorithm 3 Sleep sets. 

Input: 

    • a set $B_{en}$ of enabled atomic blocks. 

    • a sleep set $Z$. 

Output: 

    • a reduced set $B_{red} \subseteq B_{en}$ of enabled atomic blocks. 

    • a mapping $M_Z : B_{red} \to \mathcal{P}(ABlock)$. 

(1) $B_{red} := B_{en} \setminus Z$. 

(2) For all $\alpha \in B_{red}$: 

    (a) For all $\beta \in Z$: 

        • If $(\alpha, \beta) \notin D$ ($\alpha$ and $\beta$ are independent): $M_Z[\alpha] := M_Z[\alpha] \cup \{\beta\}$. 

    (b) $Z := Z \cup \{\alpha\}$. 

POR based on sleep sets can also be applied to ESST. First, we extend the node of 
ARF to include a sleep set. That is, an ARF node is a tuple $(\langle l_1, \varphi_1\rangle, \ldots, \langle l_N, \varphi_N\rangle, \varphi, \mathbb{S}, Z)$, 
where the sleep set $Z$ is a set of atomic blocks. The sleep set is ignored during coverage 
check. Second, from the set of enabled atomic blocks and the sleep set of the current node, 
we compute a subset of enabled atomic blocks and a mapping from every atomic block in 
the former subset to a successor sleep set. 

Let $D$ be a valid dependence relation. Algorithm 3 shows how to compute a reduced set 
of enabled transitions $B_{red}$ and a mapping $M_Z$ to successor sleep sets using $D$. The input 
of the algorithm is a set $B_{en}$ of enabled atomic blocks and the sleep set $Z$ of the current 
node. Note that the set $B_{en}$ can be a persistent set obtained by Algorithm 1. 
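
Algorithm 3 translates almost line by line into Python. In the sketch below the enabled blocks are passed as a list, because the result depends on the order in which blocks are processed in step (2); the names sleep_sets and dependent are assumptions of this sketch.

```python
def sleep_sets(enabled, sleep, dependent):
    """Sketch of Algorithm 3.

    enabled   : list of enabled atomic blocks (B_en), in processing order
    sleep     : sleep set Z of the current node
    dependent : predicate standing for membership in D
    Returns (B_red, M_Z).
    """
    z = set(sleep)
    reduced = [a for a in enabled if a not in z]       # (1) B_red := B_en \ Z
    successor_sleep = {a: set() for a in reduced}
    for a in reduced:                                  # (2)
        for b in z:                                    # (2a)
            if not dependent(a, b):                    # a and b are independent
                successor_sleep[a].add(b)
        z.add(a)                                       # (2b) Z := Z ∪ {a}
    return reduced, successor_sleep

# Hypothetical instance: "a" and "b" are dependent, "s" and "c" are dependent.
DEP = {frozenset({"a", "b"}), frozenset({"s", "c"})}

def dep(x, y):
    return frozenset({x, y}) in DEP

B_RED, M_Z = sleep_sets(["a", "b", "c"], {"s"}, dep)
```

Here "c" inherits "a" and "b" in its successor sleep set, since both were processed earlier and are independent of "c", whereas "s" is dropped from it because "s" and "c" are dependent.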

Similar to the persistent set technique, we introduce the function Sleep that takes as 
inputs an ARF node $\eta$ and a set of scheduler states $S$, and outputs a subset $S'$ of $S$ along 
with the above mapping $M_Z$. From the ARF node and the scheduler states, Sleep extracts 
the set $B_{en}$ of enabled atomic blocks and the current sleep set. Sleep then computes 
a subset $B_{red}$ of $B_{en}$ of enabled atomic blocks and the mapping $M_Z$ using Algorithm 3. 
Finally, Sleep constructs back a subset $S'$ of the input set $S$ of scheduler states from the 
set $B_{red}$ of enabled atomic blocks. 

Let $\eta = (\langle l_1, \varphi_1\rangle, \ldots, \langle l_N, \varphi_N\rangle, \varphi, \mathbb{S}, Z)$ be an ARF node that is going to be expanded. 
We replace the rule E2 in the following way: let $(S', M_Z)$ = Sleep($\eta$, Sched($\mathbb{S}$)), create a 
new ART whose root is the node $(\langle l_1, \varphi_1\rangle, \ldots, \langle l_N, \varphi_N\rangle, \varphi, \mathbb{S}', M_Z[l'])$ for each $\mathbb{S}' \in S'$ such 
that $l'$ is the atomic block of the running thread in $\mathbb{S}'$ (rule E2''). 

One can easily combine persistent and sleep sets by replacing the above computation 
$(S', M_Z)$ = Sleep($\eta$, Sched($\mathbb{S}$)) by $(S', M_Z)$ = Sleep($\eta$, Persistent($\eta$, Sched($\mathbb{S}$))). 

5.3. Correctness of ESST + POR. The correctness of POR with respect to verifying 
program assertions in transition systems has been shown in Theorem 5.4. The correctness 
proof relies on the enabledness and commutativity of independent transitions. However, 
the proof is applied in the concrete state space of the transition system, while the ESST 
algorithm works in the abstract state space represented by the ARF. The following observation 
shows that two transitions that are independent in the concrete state space may not 
commute in the abstract state space. 

Figure 8: Independent transitions do not commute in abstract state space. 



For simplicity of presentation, we represent an abstract state by a formula representing 
a region. Let $g_1, g_2$ be global variables, and $p, q$ be predicates such that $p \Leftrightarrow (g_1 < g_2)$ and 
$q \Leftrightarrow (g_1 = g_2)$. Let $\alpha$ be the transition $g_1 := g_1 - 1$ and $\beta$ be the transition $g_2 := g_2 - 1$. 
It is obvious that $\alpha$ and $\beta$ are independent of each other. However, Figure 8 shows that the 
two transitions do not commute when we start from an abstract state $\eta_1$ such that $\eta_1 \Leftrightarrow p$. 
The edges in the figure represent the computation of abstract strongest post-condition of 
the corresponding abstract states and transitions. 
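
The effect can be reproduced by brute force on a small window of integer values. The Python sketch below makes several assumptions that are not part of the text: an abstract state is represented as the set of truth-value pairs for $(p, q)$ that it allows, the abstract post is computed by concretizing over a finite window, applying the transition, and re-abstracting, and the window bounds are arbitrary.

```python
from itertools import product

DOMAIN = range(-3, 4)               # finite window of values (assumption)

def valuation(state):
    """Truth values of the predicates (p, q) in a concrete state (g1, g2)."""
    g1, g2 = state
    return (g1 < g2, g1 == g2)

def alpha(state):                   # g1 := g1 - 1
    return (state[0] - 1, state[1])

def beta(state):                    # g2 := g2 - 1
    return (state[0], state[1] - 1)

def abstract_post(abs_state, transition):
    """Concretize within the window, apply the transition, re-abstract."""
    concrete = [s for s in product(DOMAIN, DOMAIN) if valuation(s) in abs_state]
    return {valuation(transition(s)) for s in concrete}

ETA_1 = {(True, False)}             # eta_1 <=> p

ALPHA_THEN_BETA = abstract_post(abstract_post(ETA_1, alpha), beta)
BETA_THEN_ALPHA = abstract_post(abstract_post(ETA_1, beta), alpha)
```

Under these assumptions, applying $\alpha$ first keeps the abstract state at $p$, and the subsequent $\beta$ weakens it to $p \lor q$; applying $\beta$ first yields $p \lor q$, which the subsequent $\alpha$ strengthens back to $p$. The two orders thus end in different abstract states, although on concrete states both orders always produce the same result.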

Even though two independent transitions do not commute in the abstract state space, 
they still commute in the concrete state space overapproximated by the abstract state space, 
as shown by the lemma below. 

Lemma 5.7. Let $\alpha$ and $\beta$ be transitions that are independent of each other such that 
for concrete states $s_1, s_2, s_3$ and abstract state $\eta$ we have $s_1 \models \eta$, and both $\alpha(s_1, s_2)$ and 
$\beta(s_2, s_3)$ hold. Let $\eta'$ be the abstract successor state of $\eta$ by applying the abstract strongest 
post-operator to $\eta$ and $\beta$, and $\eta''$ be the abstract successor state of $\eta'$ by applying the abstract 
strongest post-operator to $\eta'$ and $\alpha$. Then, there are concrete states $s_4$ and $s_5$ such that: 
$\beta(s_1, s_4)$ holds, $s_4 \models \eta'$, $\alpha(s_4, s_5)$ holds, $s_5 \models \eta''$, and $s_3 = s_5$. 

The above lemma shows that POR can be applied in the abstract state space. Let 
ESST$_{POR}$ be the ESST algorithm with POR. The correctness of POR in ESST is stated 
by the following theorem: 

Theorem 5.8. Let $P$ be a threaded sequential program. For all terminating executions of 
ESST($P$) and ESST$_{POR}$($P$), we have that ESST($P$) reports safe iff so does ESST$_{POR}$($P$). 



6. Experimental Evaluation 

In this section we show an experimental evaluation of the ESST algorithm in the verification 
of multi-threaded programs in the FairThreads [Bou06] programming framework. The aim 
of this evaluation is to show the effectiveness of ESST and of the partial-order reduction 
applied to ESST. By following the same methodology, the ESST algorithm can be adapted 
to other programming frameworks, like SpecC [GDPG01] and OSEK/VDX [OSE05], with 
moderate effort. 

6.1. Verifying FairThreads. FairThreads is a framework for programming multi-threaded 
software that allows for mixing both cooperative and preemptive threads. As we want to 
apply ESST, we only deal with the cooperative threads. FairThreads includes a scheduler 
that executes threads according to a simple round-robin policy. FairThreads also provides 
a programming interface that allows threads to synchronize and communicate with each 
other. Examples of synchronization primitives of FairThreads are as follows: await(e) for 
waiting for the notification of event e if such a notification does not exist, generate(e) for 
generating a notification of event e, cooperate for yielding the control back to the scheduler, 
and join(t) for waiting for the termination of thread t. 

Figure 9: The scheduler of FairThreads. 

The scheduler of FairThreads is shown in Figure 9. At the beginning all threads are 
set to be runnable. The executions of threads consist of a series of instants in which the 
scheduler runs all runnable threads, in a deterministic round-robin fashion, until there are 
no more runnable threads. 

A running thread can yield the control back to the scheduler either by waiting for an 
event notification (await), by cooperating (cooperate), or by waiting for another thread to 
terminate (join). A thread that executes the primitive await(e) can observe the notification 
of e even though the notification occurs long before the execution of the primitive, so long 
as the execution of await(e) is still in the same instant as the notification of e. Thus, the 
execution of await does not necessarily yield the control back to the scheduler. 

When there are no more runnable threads, the scheduler enters the end-of-instant phase. 
In this phase the scheduler wakes up all threads that had cooperated during the last instant, 
and also clears all event notifications. The scheduler then starts a new instant if there are 
runnable threads; otherwise the execution ends. 
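
The instant-based behavior described above can be mimicked with Python generators. This is a toy model under several assumptions, not the FairThreads implementation: threads are generators that yield request tuples, join is omitted, a thread woken by generate is appended to the end of the current round, and all names (run, producer, consumer) are hypothetical.

```python
def run(threads):
    """Toy model of the instant-based cooperative scheduler.

    threads: list of (name, generator); a generator yields the requests
             ("generate", e), ("await", e) or ("cooperate",).
    Returns one list per instant with the names of the threads in the
    order in which they were given control.
    """
    order = [name for name, _ in threads]      # fixed round-robin order
    gens = dict(threads)
    runnable = list(order)                     # initially all threads are runnable
    waiting = {}                               # thread name -> awaited event
    instants = []
    while runnable:
        events, cooperated, log = set(), [], []
        while runnable:
            name = runnable.pop(0)
            log.append(name)
            for req in gens[name]:             # resume the thread where it stopped
                if req[0] == "generate":
                    events.add(req[1])
                    for t in [t for t in order if waiting.get(t) == req[1]]:
                        del waiting[t]
                        runnable.append(t)     # assumption: woken threads run last
                elif req[0] == "await":
                    if req[1] not in events:   # no notification in this instant yet
                        waiting[name] = req[1]
                        break                  # control goes back to the scheduler
                elif req[0] == "cooperate":
                    cooperated.append(name)
                    break
        instants.append(log)
        # End-of-instant phase: wake up cooperated threads, clear notifications.
        runnable = [t for t in order if t in cooperated]
    return instants

def producer():
    yield ("generate", "e")
    yield ("cooperate",)
    yield ("generate", "e")

def consumer():
    yield ("await", "e")
    yield ("cooperate",)
    yield ("await", "e")

CONSUMER_FIRST = run([("consumer", consumer()), ("producer", producer())])
PRODUCER_FIRST = run([("producer", producer()), ("consumer", consumer())])
```

When the producer runs first, the consumer finds the notification already present and its await does not give up control, so each instant contains two activations. When the consumer runs first, it is suspended and later woken within the same instant, so it is activated twice per instant.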

The operational semantics of cooperative FairThreads has been described in [Bou02]. 
However, it is not clear from the semantics whether the round-robin order of the thread 
executions remains the same from one instant to the other. Here, we assume that the order 
is the same from one instant to the other. The operational semantics does not specify 
the initial round-robin order of the thread executions either. Thus, for the verification, one 
needs to explore all possible round-robin orders. This situation could easily degrade the 
performance of ESST and possibly lead to state explosion. The POR techniques described 
in Section 5 could in principle address this problem. 

In this section we evaluate two software model checking approaches for the verification 
of FairThreads programs. In the first approach we rely on a translation from FairThreads 
into sequential programs (or sequentialization) , such that the resulting sequential programs 
contain both the mapping of the cooperative threads in the form of functions and the 
encoding of the FairThreads scheduler. The thread activations are encoded as function 
calls from the scheduler function to the functions that correspond to the threads. The 
program can be thought of as jumping back and forth between the "control level" imposed 
by the scheduler, and the "logical level" implemented by the threads. Having the sequential 
program, we then use off-the-shelf software model checkers to verify the programs. 
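
The shape of such a sequentialized program can be pictured with a small Python sketch. It is only meant to show the jumping between the scheduler and the thread functions: the sequentialized programs in the evaluation are programs for software model checkers, and every name below (state, inc, check, scheduler) is a hypothetical choice of this sketch.

```python
# Shared state of the sequentialized program; "pc" stores, per thread, the
# point at which the thread resumes after it has cooperated.
state = {"x": 0, "pc": {"inc": 0}, "runnable": ["inc", "check"], "log": []}

def inc():
    """Thread body split at its single cooperate."""
    if state["pc"]["inc"] == 0:
        state["x"] += 1
        state["pc"]["inc"] = 1
        return "cooperate"          # yield control back to the scheduler
    state["x"] += 1
    return "done"

def check():
    """Thread that records the value of the shared variable and terminates."""
    state["log"].append(state["x"])
    return "done"

THREADS = {"inc": inc, "check": check}

def scheduler():
    """Control level: threads are activated by plain function calls."""
    while state["runnable"]:
        cooperated = []
        for name in list(state["runnable"]):   # one round-robin instant
            if THREADS[name]() == "cooperate":
                cooperated.append(name)
        state["runnable"] = cooperated         # end of instant

scheduler()
```

Even in this tiny example the scheduler bookkeeping (the runnable list and the saved resume points) lives in ordinary program variables, which is the kind of detail a predicate-abstraction based checker has to track on the sequentialized program.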

In the second approach we apply the ESST algorithm to verify FairThreads programs. 
In this approach we define a set of primitive functions that implement FairThreads 
synchronization functions, and instantiate the scheduler of ESST with the FairThreads scheduler. 
We then translate the FairThreads program into a threaded program such that there is 
a one-to-one correspondence between the threads in the FairThreads program and in the 
resulting threaded program. Furthermore, each call to a FairThreads synchronization 
function is translated into a call to the corresponding primitive function. The ESST algorithm 
is then applied to the resulting threaded program. 



6.2. Experimental evaluation setup. The ESST algorithm has been implemented in the 
Kratos software model checker [CGM+11]. In this work we have extended Kratos with 
the FairThreads scheduler and the primitive functions that correspond to the FairThreads 
synchronization functions. 

We have carried out a significant experimental evaluation on a set of benchmarks taken 
and adapted from the literature on verification of cooperative threads. For example, the 
fact* benchmarks are extracted from [JBGT10], which describes a synchronous approach to 
verifying the absence of deadlocks in FairThreads programs. We adapted the benchmarks 
by recoding the bad synchronization, which can cause deadlocks, as an unreachable false 
assertion. The gear-box benchmark is taken from the case study in [WH08]. This case 
study is about an automated gearbox control system that consists of a five-speed gearbox 
and a dry clutch. Our adaptation of this benchmark does not model the timing behavior of 
the components and gives the same priority to all tasks (or threads) of the control system. 
In our case we considered the verification of safety properties that do not depend on the 
timing behavior. Ignoring the timing behavior in this case results in more non-determinism 
than that of the original case study. The ft-pc-sfifo* and ft-token-ring* benchmarks 
are taken and adapted from, respectively, the pc-sfifo* and token-ring* benchmarks used 
in [CMNR10, CNR11]. All considered benchmarks satisfy the restriction of ESST: the 
arguments passed to every call to a primitive function are constants. 

For the sequentialized version of FairThreads programs, we experimented with several 
state-of-the-art predicate-abstraction based software model checkers, including SatAbs-3.0 
[CKSY05], CpaChecker [BK11], and the sequential analysis of Kratos [CGM+11]. 
We also experimented with CBMC-4.0 [CKL04] for bug hunting with bounded model checking 
(BMC) [BCCZ99]. For the BMC experiment, we set the size of loop unwindings to 5 
and consider only the unsafe benchmarks. All benchmarks and tools' setup are available at 
http://es.fbk.eu/people/roveri/tests/jlmcs-esst. 

We ran the experiments on a Linux machine with Intel-Xeon DC 3GHz processor and 
4GB of RAM. We fixed the time limit to 1000 seconds, and the memory limit to 4GB. 



6.3. Results of Experiments. The results of experiments are shown in Table 1 for the 
run times, and in Table 2 for the numbers of explored abstract states by ESST. The 
column V indicates the status of the benchmarks: S for safe and U for unsafe. In the 
experiments we also enable the POR techniques in ESST. The column No-POR indicates 
that during the experiments POR is not enabled. The column P-POR indicates that only 
the persistent set technique is enabled, while the column S-POR indicates that only the 
sleep set technique is enabled. The column PS-POR indicates that both the persistent set 
and the sleep set techniques are enabled. We mark the best results with bold letters, and 
denote the out-of-time results by T.O. 

The results clearly show that ESST outperforms the predicate abstraction based 
sequentialization approach. The main bottleneck in the latter approach is the number of 
predicates that the model checkers need to keep track of to model details of the scheduler. 
For example, on the ft-pc-sfifo1.c benchmark SatAbs, CpaChecker, and the sequential 
analysis of Kratos need to keep track of, respectively, 71, 37, and 45 predicates. On the 
other hand, ESST only needs to keep track of 8 predicates on the same benchmark. 

                                     Sequentialization                                ESST
Name                  V     SatAbs CpaChecker     Kratos       CBMC     No-POR      P-POR      S-POR     PS-POR
fact1                 S       9.07      14.26       2.90          -       0.01       0.01       0.01       0.01
fact1-bug             U      22.18       8.06       0.39      15.09       0.01       0.01       0.01       0.03
fact1-mod             S       4.41       8.18       0.50          -       0.40       0.40       0.39       0.39
fact2                 S      69.05      17.25      15.40          -       0.01       0.01       0.01       0.01
gear-box              S        T.O        T.O        T.O          -        T.O     473.55      44.89      44.19
ft-pc-sfifo1          S      57.08      56.56      44.49          -       0.30       0.30       0.29       0.29
ft-pc-sfifo2          S     715.31        T.O        T.O          -       0.39       0.39       0.30       0.39
ft-token-ring.3       S     115.66        T.O        T.O          -       0.48       0.29       0.20       0.20
ft-token-ring.4       S     448.86        T.O        T.O          -       5.20       1.10       0.29       0.29
ft-token-ring.5       S        T.O        T.O        T.O          -     213.37       6.20       0.50       0.40
ft-token-ring.6       S        T.O        T.O        T.O          -        T.O      92.39       0.69       0.49
ft-token-ring.7       S        T.O        T.O        T.O          -        T.O        T.O       0.99       0.80
ft-token-ring.8       S        T.O        T.O        T.O          -        T.O        T.O       1.80       0.89
ft-token-ring.9       S        T.O        T.O        T.O          -        T.O        T.O       3.89       1.70
ft-token-ring.10      S        T.O        T.O        T.O          -        T.O        T.O       9.60       2.10
ft-token-ring-bug.3   U     111.10        T.O        T.O     158.76       0.10       0.10       0.10       0.10
ft-token-ring-bug.4   U     306.41        T.O        T.O    *407.36       1.70       0.30       0.10       0.10
ft-token-ring-bug.5   U     860.29        T.O        T.O    *751.44      66.09       1.80       0.10       0.10
ft-token-ring-bug.6   U        T.O        T.O        T.O        T.O        T.O      26.29       0.20       0.10
ft-token-ring-bug.7   U        T.O        T.O        T.O        T.O        T.O        T.O       0.30       0.20
ft-token-ring-bug.8   U        T.O        T.O        T.O        T.O        T.O        T.O       0.60       0.29
ft-token-ring-bug.9   U        T.O        T.O        T.O        T.O        T.O        T.O       1.40       0.60
ft-token-ring-bug.10  U        T.O        T.O        T.O        T.O        T.O        T.O       3.60       0.79

Table 1: Run time results of the experimental evaluation (in seconds). 

Regarding the refinement steps, ESST needs fewer abstraction-refinement iterations than 
other techniques. For example, starting with the empty precision, the sequential analysis 
of Kratos needs 8 abstraction-refinement iterations to verify fact2, and 35 
abstraction-refinement iterations to verify ft-pc-sfifo1. ESST, on the other hand, verifies fact2 
without performing any refinements at all, and verifies ft-pc-sfifo1 with only 3 
abstraction-refinement iterations. 

The BMC approach, represented by CBMC, is ineffective on our benchmarks. First, 
the breadth-first nature of the BMC approach creates big formulas on which the satisfiability 
problems are hard. In particular, CBMC employs bit-precise semantics, which contributes 
to the hardness of the problems. Second, for our benchmarks, it is not feasible to identify the 
size of loop unwindings that is sufficient for finding the bug. For example, due to insufficient 
loop unwindings, CBMC reports safe for the unsafe benchmarks ft-token-ring-bug.4 and 
ft-token-ring-bug.5 (marked with "*"). Increasing the size of loop unwindings only results 
in time out. 

Table 1 also shows that the POR techniques boost the performance of ESST and allow 
us to verify benchmarks that could not be verified given the resource limits. In particular 
we get the best results when the persistent set and sleep set techniques are applied together. 
Additionally, Table 2 shows that the POR techniques reduce the number of abstract states 
explored by ESST. This reduction also implies the reductions on the number of abstract 
post computations and on the number of coverage checks. 

Name                      No-POR     P-POR     S-POR    PS-POR
fact1                         66        66        66        66
fact1-bug                     49        49        49        49
fact1-mod                    269       269       269       269
fact2                         49        49        29        29
gear-box                       -    204178     60823     58846
ft-pc-sfifo1                 180       180       180       180
ft-pc-sfifo2                 540       287       310       287
ft-token-ring.3             1304       575       228       180
ft-token-ring.4             7464      2483       375       266
ft-token-ring.5            50364      7880       699       395
ft-token-ring.6                -     32578      1239       518
ft-token-ring.7                -         -      2195       963
ft-token-ring.8                -         -      4290      1088
ft-token-ring.9                -         -      8863      2628
ft-token-ring.10               -         -     16109      3292
ft-token-ring-bug.3          496       223       113        89
ft-token-ring-bug.4         2698       914       179       125
ft-token-ring-bug.5        17428      2801       328       181
ft-token-ring-bug.6            -     11302       611       251
ft-token-ring-bug.7            -         -      1064       457
ft-token-ring-bug.8            -         -      2133       533
ft-token-ring-bug.9            -         -      4310      1281
ft-token-ring-bug.10           -         -      8039      1632

Table 2: Numbers of explored abstract states. 



Despite the effectiveness shown by the obtained results, the following remarks are 
in order. POR, in principle, could interact negatively with the ESST algorithm. The 
construction of ARF in ESST is sensitive to the explored scheduler states and to the 
tracked predicates. POR prunes some scheduler states that ESST has to explore. However, 
exploring such scheduler states can yield a smaller ARF than if they are omitted. In 
particular, for an unsafe benchmark, exploring omitted scheduler states can lead to the 
shortest counter-example path. Furthermore, exploring the omitted scheduler states could 
lead to spurious counter-example ARF paths that yield predicates that allow ESST to 
perform fewer refinements and construct a smaller ARF. 

6.4. Verifying SystemC. SystemC is a C++ library that has widely been used to write 
executable models of systems-on-chips. The library consists of a language to model the 
component architecture of the system and also to model the parallel behavior of the system 
by means of sequential threads. Similar to FairThreads, the SystemC scheduler employs 
cooperative scheduling, and the execution of the scheduler is divided into a series of so-called 
delta cycles, which correspond to the notion of instant. 

Despite their similarities, the scheduling policy and the behavior of synchronization 
primitives of SystemC and FairThreads have significant differences. For example, the 
FairThreads scheduler employs round-robin scheduling, while the SystemC scheduler can 
execute any runnable thread. Also, in FairThreads a notification of an event performed by 
some thread can later still be observed by another thread, as long as the execution of the 
other thread is still in the same instant as the notification. In SystemC the latter thread 
will simply miss the notification. 

In [CMNR10, CNR11], we report on the application of ESST to the verification of SystemC 
models. We follow a similar approach, comparing ESST and the sequentialization 
approach, and also experimenting with POR in ESST. The results of those experiments 
show the same patterns as the results reported here for FairThreads: the ESST approach 
outperforms the sequentialization approach, and the POR techniques further improve the 
performance of ESST in terms of run time and the number of visited abstract states. 
These results allow us to conclude that the ESST algorithm, along with the POR techniques, 
is a very effective and general technique for the verification of cooperative threads. 

7. Related Work 

There have been a plethora of works on developing techniques for the verification of multi- 
threaded programs, both for general ones and for those with specific scheduling policies. 
Similar to the work in this paper, many of these existing techniques are concerned with 
verifying safety properties. In this section we review some of these techniques and describe 
how they are related to our work. 

7.1. Verification of Cooperative Threads. Techniques for verifying multi-threaded programs 
with cooperative scheduling policy have been considered in different application 
domains: [MMMC05, GD05, KS05, TCMM07, HFG08, BK08, CMNR10] for SystemC, 
[JBGT10] for FairThreads, [WH08] for OSEK/VDX, and [CJK07] for SpecC. Most of 
these techniques either embed details of the scheduler in the programs under verification 
or simply abstract away those details. As shown in [CMNR10], verification techniques that 
embed details of the scheduler show poor scalability. On the other hand, abstracting away 
the scheduler not only makes the techniques report too many false positives, but also limits 
their applicability. The techniques described in [MMMC05, TCMM07, HFG08] only 
employ explicit-state model checking techniques, and thus they cannot effectively handle 
infinite-domain inputs for threads. ESST addresses these issues by analyzing the threads 
symbolically and by orchestrating the overall verification by direct execution of the scheduler, 
which can be modeled faithfully. 

7.2. Thread-modular Model Checking. In the traditional verification methods, such 
as the one described in [OG76], safety properties are proved with the help of assertions 
that annotate program statements. These annotations form the pre- and post-conditions 
for the statements. The correctness of the assertions is then proved by proof rules that are 
similar to the Floyd-Hoare proof rules [Flo67, Hoa83] for sequential programs. The method 
in [OG76] requires a so-called interference freedom test to ensure that no assertions used in 
the proof of one thread are invalidated by the execution of another thread. Such a freedom 
test makes this method non-modular (each thread cannot be verified in isolation from other 
threads). 

Jones [Jon83] introduces thread-modular reasoning that verifies each thread separately
using assumptions about the other threads. In this work the interference information is
incorporated into the specifications as environment assumptions and guarantee relations. The
environment assumptions model the interleaved transitions of other threads by describing
their possible updates of shared variables. The guarantee relations describe the global state
updates of the whole program. However, the formulation of the environment assumptions
in [Jon83] and [OG76] incurs a significant verification cost.

Flanagan and Qadeer [FQ03] describe a thread-modular model checking technique that
automatically infers environment assumptions. First, a guarantee relation for each
thread is inferred. The assumption relation for a thread is then the disjunction of all the
guarantee relations of the other threads. Similar to ESST, this technique computes an
over-approximation of the reachable concrete states of the multi-threaded program by abstraction
using the environment assumptions. However, unlike ESST, the thread-modular model
checking technique is incomplete since it can report false positives.
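
The step "assumption of a thread = disjunction of the guarantees of the other threads" can be
illustrated with a minimal sketch. It is not the algorithm of [FQ03]: relations are assumed to
be finite sets of (state, successor) pairs over integer states, and the names
`environment_assumptions`, `guarantees`, and `assumptions` are hypothetical.

```python
from typing import Dict, FrozenSet, Tuple

State = int
Relation = FrozenSet[Tuple[State, State]]  # finite relation as a set of pairs (assumption)


def environment_assumptions(guarantees: Dict[str, Relation]) -> Dict[str, Relation]:
    """For each thread, take the union (disjunction) of the other threads' guarantees."""
    result: Dict[str, Relation] = {}
    for thread in guarantees:
        env = set()
        for other, relation in guarantees.items():
            if other != thread:
                env |= relation
        result[thread] = frozenset(env)
    return result


# Hypothetical guarantee relations for three threads.
guarantees = {
    "T1": frozenset({(0, 1)}),
    "T2": frozenset({(1, 2)}),
    "T3": frozenset({(2, 0)}),
}
assumptions = environment_assumptions(guarantees)
```

In this toy setting the assumption for T1 contains exactly the updates that T2 and T3 guarantee,
and none of T1's own updates.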

Similar to ESST, the work in [HJMQ03] describes a CEGAR-based thread-modular
model checking technique that analyzes the data-flow of each thread symbolically using
predicate abstraction, starting from a very coarse over-approximation of the thread's data 
states and successively refining the approximation using predicates discovered during the 
CEGAR loop. Unlike ESST, the thread-modular algorithm also analyzes the environment 
assumption symbolically starting with an empty environment assumption and subsequently 
weakening it using the refined abstractions of threads' data states. 

Chaki et al. [CQYC03] describe another CEGAR-based model checking technique.
Like ESST, the programs considered by this technique have a fixed number of threads.
But, unlike the previous techniques that deal with shared-variable multi-threaded
programs, the threads considered by this technique use message passing as the synchronization
mechanism. This technique uses two levels of abstraction over each individual thread.
The first abstraction level is predicate abstraction. The second one, which is applied to 
the result of the first abstraction, is action-guided abstraction. The parallel composition 
of the threads is performed after the second abstraction has been applied. Compositional 
reasoning is used during the check for spuriousness of a counter-example by projecting and 
examining the counter-example on each individual thread separately. 

Recently, Gupta et al. [GPR11] have proposed a new predicate abstraction and
refinement technique for verifying multi-threaded programs. Similar to ESST, the technique
constructs an ART for each thread. But unlike ESST, branches in the constructed ART
might not correspond to a CFG unwinding but to transitions of the environment.
The technique uses a declarative formulation of the refinement to describe constraints on
the desired predicates for thread reachability and environment transition. Depending on
the declarative formulation, the technique can generate a non-modular proof as in [OG76]
or a modular proof as in [FQ03].



7.3. Bounded Model Checking. Another approach to verifying multi-threaded programs
is bounded model checking (BMC) [BCCZ99]. For multi-threaded programs, the bound
concerns not only the length (or depth) of CFG unwinding, as in the case of
sequential programs, but also the number of scheduler invocations or the number of
context switches. This approach is sound and complete, but only up to the given bound.
Prominent techniques that exploit the BMC approach include [God05] and [QR05].

The work in [God05] limits the number of scheduler invocations, while the work in [QR05]
bounds the number of context switches. That is, given a bound k, the technique verifies
whether a multi-threaded program can fail an assertion through an execution with at most k
context switches. This technique relies on regular push-down systems [Sch00] for a finite
representation of the unbounded number of stack configurations. The ESST algorithm can
easily be made depth bounded or context-switch bounded by not expanding the constructed
ARF node when the number of ARF connectors leading to the node has reached the bound.

The above depth bounded and context-switch bounded model checking techniques are
ineffective in finding errors that appear only after each thread has a chance to complete
its execution. To overcome this limitation, Musuvathi and Qadeer [MQ07] have proposed
a BMC technique that bounds the number of context switches caused only by scheduler
preemptions. Such a bound gives the opportunity for each thread to complete its execution.
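
To make the role of the bound concrete, the following sketch enumerates the interleavings of a
toy program explicitly and counts every change of the running thread as a context switch. It is
an illustration only, not the push-down-system algorithm of [QR05]; the program (two threads
that each increment a shared `x` non-atomically) and all names are assumptions made for this
example.

```python
from typing import Callable, Dict, List, Optional, Tuple

Shared = Dict[str, int]
Step = Callable[[Shared], None]


def violates_within_bound(threads: List[List[Step]], init: Shared,
                          bad: Callable[[Shared], bool], k: int) -> bool:
    """Return True if a bad state is reachable with at most k context switches."""
    n = len(threads)

    def explore(shared: Shared, pcs: Tuple[int, ...],
                current: Optional[int], switches: int) -> bool:
        if bad(shared):
            return True
        for t in range(n):
            if pcs[t] == len(threads[t]):
                continue  # thread t has terminated
            # Running a thread other than the current one costs one switch.
            cost = 0 if current is None or t == current else 1
            if switches + cost > k:
                continue  # this step would exceed the bound
            nxt = dict(shared)
            threads[t][pcs[t]](nxt)
            new_pcs = pcs[:t] + (pcs[t] + 1,) + pcs[t + 1:]
            if explore(nxt, new_pcs, t, switches + cost):
                return True
        return False

    return explore(dict(init), (0,) * n, None, 0)


def read_x(tmp: str) -> Step:
    def step(s: Shared) -> None:
        s[tmp] = s["x"]  # copy the shared variable into a local
    return step


def write_x(tmp: str) -> Step:
    def step(s: Shared) -> None:
        s["x"] = s[tmp] + 1  # write back the incremented local
        s["done"] += 1       # record that this thread has finished
    return step


def lost_update(s: Shared) -> bool:
    return s["done"] == 2 and s["x"] != 2


threads = [[read_x("t1"), write_x("t1")], [read_x("t2"), write_x("t2")]]
init = {"x": 0, "t1": 0, "t2": 0, "done": 0}
found_k1 = violates_within_bound(threads, init, lost_update, 1)
found_k2 = violates_within_bound(threads, init, lost_update, 2)
```

With a bound of 1 the second thread to be interrupted never gets to finish, so the lost update
is missed; with a bound of 2 it is found. This is the kind of error, visible only once every
thread has completed, that motivates bounding preemptions only.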

The state-space complexity imposed by the previously described BMC techniques grows
with the number of threads. Thus, those techniques are ineffective for verifying
multi-threaded programs that allow dynamic thread creation. Recently a technique called
delay bounded scheduling has been proposed in [EQR11]. Given a bound k, a deterministic
scheduler is made non-deterministic by allowing the scheduler to delay its next executed
thread at most k times. The bound k is chosen independently of the number of threads.
This technique has been used for the analysis and testing of concurrent programs [MQ06].
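
The idea of delaying can be sketched as follows. The sketch assumes a round-robin scheduler
that runs one step of the next runnable thread at a time; this choice of deterministic
scheduler, the rule that a delay is only possible when another thread is runnable, and the
function name `delay_bounded_schedules` are assumptions of this example rather than details
taken from [EQR11].

```python
from typing import Optional, Set, Tuple


def delay_bounded_schedules(steps: Tuple[int, ...], k: int) -> Set[Tuple[int, ...]]:
    """Enumerate schedules of a round-robin scheduler that may delay at most k times.

    steps[t] is the number of steps thread t has to execute.
    """
    n = len(steps)
    schedules: Set[Tuple[int, ...]] = set()

    def next_runnable(remaining: Tuple[int, ...], start: int) -> Optional[int]:
        for offset in range(n):
            t = (start + offset) % n
            if remaining[t] > 0:
                return t
        return None

    def explore(remaining: Tuple[int, ...], pointer: int,
                delays_left: int, prefix: Tuple[int, ...]) -> None:
        t = next_runnable(remaining, pointer)
        if t is None:
            schedules.add(prefix)  # every thread has finished
            return
        # Deterministic choice: run one step of thread t.
        after = remaining[:t] + (remaining[t] - 1,) + remaining[t + 1:]
        explore(after, (t + 1) % n, delays_left, prefix + (t,))
        # Non-deterministic choice: delay t and move on to the next runnable thread.
        if delays_left > 0:
            u = next_runnable(remaining, (t + 1) % n)
            if u is not None and u != t:
                explore(remaining, u, delays_left - 1, prefix)

    explore(steps, 0, k, ())
    return schedules


no_delay = delay_bounded_schedules((1, 1), 0)
one_delay = delay_bounded_schedules((1, 1), 1)
```

With k = 0 the scheduler stays deterministic and yields a single schedule; each additional
delay enlarges the set of explored schedules without any reference to the number of threads.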

SAT/SMT-based BMC has also been applied to the verification of multi-threaded
programs. In [RG05] a SAT-based BMC that bounds the number of context switches has been
described. In this work, for each thread, a set of constraints describing the thread is
generated using BMC techniques for sequential programs [CKL04]. Constraints for concurrency
describing both the number of context switches and the reading or writing of global variables
are then added to the previous sets of constraints. The work in [GG08] is also concerned
with efficient modeling of multi-threaded programs using SMT-based BMC. Unlike [RG05],
in this work the constraints for concurrency are added lazily during the BMC unrolling.

7.4. Verification via Sequentialization. Yet another approach used for verifying multi- 
threaded programs is by reducing bounded concurrent analysis to sequential analysis. In 
this approach the multi-threaded program is translated into a sequential program such 
that the latter over-approximates the bounded reachability of the former. The resulting 
sequential program can then be analyzed using any existing model checker for sequential 
programs. 

This approach has been pioneered by the work in [QW04]. In this work a multi-threaded
program is converted to a sequential one that simulates all the interleavings generated by 
multiple stacks of the multi-threaded program using its single stack. The simulation itself 
is bounded by the size of a multiset that holds existing runnable threads at any time during 
the execution of a thread. 

Lal and Reps [LR09] propose a translation from multi-threaded programs to sequential
programs that reduces the context-bounded reachability of the former to the reachability of
the latter for any context bound. Given a bound k, the translation constructs a sequential
program that tracks, at any point, only the local state of one thread, the global state,
and k copies of the global state. In the translation each thread is processed separately
from the others, and updates of global states caused by context switches in the processed
thread are modeled by guessing future states using prophecy variables and constraining
these variables at an appropriate control point in the execution. Due to the prophecy
variables, the resulting sequential program explores more reachable states than the
original multi-threaded program. A similar translation has been proposed in [TMP09], but
this translation requires the sequential program to call the individual thread multiple times
from scratch to recompute the local states at context switches.

As shown in Section 6 and also in [CMNR10], the verification of multi-threaded
programs via sequentialization and abstraction-based software model checking techniques turns
out to suffer from several inefficiencies. First, the encoding of the scheduler makes the
sequential program more complex and harder to verify. Second, details of the scheduler are
often needed to verify the properties, and thus abstraction-based techniques require many
abstraction-refinement iterations to re-introduce the abstracted details.



7.5. Partial-Order Reduction. POR is an effective technique for reducing the search
space by avoiding visiting redundant executions. It has been mostly adopted in explicit-state
model checkers, like SPIN [Hol05, HP95, Pel96], VeriSoft [God05], and Zing [AQR+04].
Despite their inability to handle infinite-domain inputs, the maturity of these model
checkers, in particular the support for POR, has attracted research on encodings of
multi-threaded programs into the language that the model checkers accept. In [CCNR11] we
verify SystemC models by encoding them in Promela, the language accepted by the SPIN
model checker. The work shows that the resulting encodings lose the intrinsic structures of
the multi-threaded programs that are important to enable optimizations like POR.

There have been several attempts at applying POR to symbolic model checking
techniques [ABH+01, KGS06, WYKG08]. In these applications POR is achieved by statically
adding constraints describing the reduction technique into the encoding of the program.
The work in [ABH+01] applies POR to symbolic BDD-based invariant checking,
while the work in [WYKG08] describes an approach that can be considered a symbolic
sleep-set based technique. The latter introduces the notion of guarded independence relation,
where a pair of transitions are independent of each other if certain conditions specified in
the pair's guards are satisfied. The POR techniques applied in ESST can be extended
to use the guarded independence relation by exploiting the thread and global regions. Finally,
the work in [KGS06] uses patterns of lock acquisition to refine the notion of independent
transitions, which subsequently yields better reductions.



8. Conclusions and Future Work 

In this paper we have presented a new technique, called ESST, for the verification of
shared-variable multi-threaded programs with cooperative scheduling. The ESST algorithm uses
explicit-state model checking techniques to handle the scheduler, while analyzing the threads
using symbolic techniques based on lazy predicate abstraction. Such a combination allows
the ESST algorithm to have a precise model of the scheduler, to handle it efficiently, and
also to benefit from the effectiveness of explicit-state techniques in systematic exploration
of thread interleavings. At the same time, the use of symbolic techniques allows the ESST
algorithm to deal with threads that potentially have infinite state space. ESST is further
enhanced with POR techniques that prevent the exploration of redundant thread
interleavings. The results of experiments carried out on a general class of benchmarks for
SystemC and FairThreads cooperative threads clearly show that ESST outperforms the
verification approach based on sequentialization, and that POR can effectively improve the
performance.

As future work, we will proceed along different directions. We will experiment with lazy
abstraction with interpolants [McM06], to improve the performance of predicate abstraction
when there are too many predicates to keep track of. We will also investigate the possibility
of applying symmetry reduction [DKKW11] to deal with cases where there are multiple
threads of the same type, and possibly with parametrized configurations.

We will extend the ESST algorithm to deal with primitive function calls whose
arguments cannot be inferred statically. This requires a generalization of the scheduler
exploration with a hybrid (explicit-symbolic or semi-symbolic) approach, and the use of SMT
techniques to enumerate all possible next states of the scheduler. Finally, we will look into
the possibility of applying the ESST algorithm to the verification of general multi-threaded
programs. This work amounts to identifying important program locations in threads where
the control must be returned to the scheduler.

Appendix A. Proofs of Lemmas and Theorems.

Lemma (4.6). Let $\eta$ and $\eta'$ be ARF nodes for a threaded program $P$ such that $\eta'$ is a
successor node of $\eta$. Let $\gamma$ be a configuration of $P$ such that $\gamma \models \eta$. The following
properties hold:

(1) If $\eta'$ is obtained from $\eta$ by the rule E1 with the performed operation $op$, then, for any
configuration $\gamma'$ of $P$ such that $\gamma \xrightarrow{op} \gamma'$, we have $\gamma' \models \eta'$.

(2) If $\eta'$ is obtained from $\eta$ by the rule E2, then, for any configuration $\gamma'$ of $P$ such that
$\gamma \xrightarrow{\cdot} \gamma'$ and the scheduler states of $\eta'$ and $\gamma'$ coincide, we have $\gamma' \models \eta'$.

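Proof. We first prove property (1). Let $\eta$ and $\eta'$ be as follows:

$$\eta = (\langle l_1, \varphi_1 \rangle, \ldots, \langle l_i, \varphi_i \rangle, \ldots, \langle l_N, \varphi_N \rangle, \varphi, \mathbb{S}),$$
$$\eta' = (\langle l_1, \varphi'_1 \rangle, \ldots, \langle l'_i, \varphi'_i \rangle, \ldots, \langle l_N, \varphi'_N \rangle, \varphi', \mathbb{S}'),$$

such that $\mathbb{S}(s_{T_i}) = \mathit{Running}$ and for all $j \neq i$, we have $\mathbb{S}(s_{T_j}) \neq \mathit{Running}$. Let
$G_{T_i} = (L, E, l_0, L_{err})$ be the CFG for $T_i$ such that $(l_i, op, l'_i) \in E$. Let $\gamma$ and $\gamma'$ be as follows:

$$\gamma = \langle \langle l_1, s_1 \rangle, \ldots, \langle l_i, s_i \rangle, \ldots, \langle l_N, s_N \rangle, gs, \mathbb{S} \rangle,$$
$$\gamma' = \langle \langle l_1, s'_1 \rangle, \ldots, \langle l'_i, s'_i \rangle, \ldots, \langle l_N, s'_N \rangle, gs', \mathbb{S}'' \rangle,$$

such that $\gamma \xrightarrow{op} \gamma'$. We need to prove that $\gamma' \models \eta'$. Let $\hat{op}$ be $op$ if $op$ contains no primitive
function call, or be $op'$ as in the second case of the rule E1. First, from $\gamma \models \eta$, we have
$s_i \cup gs \models \varphi_i$. By the definition of operational semantics of $\hat{op}$ and the definition of $SP_{\hat{op}}(\varphi_i)$,
it follows that $s'_i \cup gs' \models SP_{\hat{op}}(\varphi_i)$. Since $SP_{\hat{op}}(\varphi_i)$ implies $SP^{\pi}_{\hat{op}}(\varphi_i)$ for any precision $\pi$, and
$\varphi'_i$ is $SP^{\pi_{l'_i}}_{\hat{op}}(\varphi_i)$ for some precision $\pi_{l'_i}$ associated with $l'_i$, it follows that $s'_i \cup gs' \models \varphi'_i$. A similar
reasoning can be applied to prove that $s'_j \cup gs' \models \varphi'_j$ for $j \neq i$ and $\bigcup_{i=1,\ldots,N} s'_i \cup gs' \models \varphi'$.
We remark that the HAVOC($\hat{op}$) operation only makes the values of global variables possibly
assigned in $\hat{op}$ unconstrained. To prove that $\gamma' \models \eta'$, it remains to show that $\mathbb{S}'$ and $\mathbb{S}''$
coincide. Now, consider the case where $\hat{op}$ does not contain any call to a primitive function. It
is then trivial that $\mathbb{S}' = \mathbb{S}''$. Otherwise, if $\hat{op}$ contains a call to a primitive function, then, since
the primitive executor follows the operational semantics, that is, Sexec($\mathbb{S}$, $f(\vec{x})$) computes
$[\![f(\vec{x})]\!](\cdot, \cdot, \mathbb{S})$, we have $\mathbb{S}' = \mathbb{S}''$. Hence, we have proven that $\gamma' \models \eta'$.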
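For property (2), let $\eta$ and $\eta'$ be as follows:

$$\eta = (\langle l_1, \varphi_1 \rangle, \ldots, \langle l_N, \varphi_N \rangle, \varphi, \mathbb{S}),$$
$$\eta' = (\langle l_1, \varphi_1 \rangle, \ldots, \langle l_N, \varphi_N \rangle, \varphi, \mathbb{S}'),$$

such that $\mathbb{S}(s_{T_i}) \neq \mathit{Running}$ for all $i = 1, \ldots, N$. Let $\gamma$ and $\gamma'$ be as follows:

$$\gamma = \langle \langle l_1, s_1 \rangle, \ldots, \langle l_N, s_N \rangle, gs, \mathbb{S} \rangle,$$
$$\gamma' = \langle \langle l_1, s'_1 \rangle, \ldots, \langle l_N, s'_N \rangle, gs', \mathbb{S}'' \rangle.$$

By the operational semantics, we have $s_i = s'_i$ for all $i = 1, \ldots, N$, and $gs = gs'$. Since
$\mathbb{S}' = \mathbb{S}''$, it follows from $\gamma \models \eta$ that $\gamma' \models \eta'$. □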

Theorem (4.7). Let $P$ be a threaded program. For every terminating execution of ESST($P$),
we have the following properties:

(1) If ESST($P$) returns a feasible counter-example path $\rho$, then we have $\gamma \xrightarrow{*} \gamma'$ for an
initial configuration $\gamma$ and an error configuration $\gamma'$ of $P$.

(2) If ESST($P$) returns a safe ARF $\mathcal{F}$, then for every configuration $\gamma \in \mathit{Reach}(P)$, there
is an ARF node $\eta \in \mathit{Nodes}(\mathcal{F})$ such that $\gamma \models \eta$.

Proof. We first prove property (1). Let the counter-example path $\rho$ be the sequence
$\xi_1, \ldots, \xi_m$, such that, for each $i = 1, \ldots, m$, the element $\xi_i$ is either an ART edge or an
ARF connector. We need to show the existence of a computation sequence $\gamma_1, \ldots, \gamma_{m+1}$
such that $\gamma_i \to \gamma_{i+1}$ for all $i = 1, \ldots, m$, and $\gamma = \gamma_1$ and $\gamma' = \gamma_{m+1}$. Let $\rho^j$, for $0 \le j \le m$,
denote the prefix $\xi_1, \ldots, \xi_j$ of $\rho$. Let $\varphi^j$ be the strongest post-condition after performing the
operations in the suppressed version of $\rho^j$. That is, writing $\sigma^j$ for the sequence of operations in
the suppressed version of $\rho^j$, $\varphi^j$ is $SP_{\sigma^j}(\mathit{true})$. For $k = 1, \ldots, m$,
we need to show that, for any configuration $\gamma_k$ satisfying $\varphi^{k-1}$ and the source node of $\xi_k$,
there is a configuration $\gamma_{k+1}$ such that $\gamma_{k+1}$ satisfies $\varphi^k$ and the target node of $\xi_k$.

First, any configuration satisfies $\mathit{true}$, and thus $\gamma \models \mathit{true}$. By definition of
counter-example path, the source node of $\xi_1$ is an initial node $\eta_0$. Any initial configuration satisfies
the initial node, and thus $\gamma \models \eta_0$. Second, take any $1 \le k \le m$, and assume that we have
a configuration $\gamma_k$ satisfying $\varphi^{k-1}$ and the source node of $\xi_k$. Consider the case where $\xi_k$
is an ART edge obtained by unwinding a CFG edge labelled by an operation $op$. Let $\hat{op}$ be
the label of the ART edge. That is, $\hat{op} = op$ if $op$ has no primitive function call; otherwise
$\hat{op} = op'$ where $op'$ is defined in the second case of rule E1. Since $\varphi^m$ is satisfiable, then so
is $\varphi^k$. It means that there is a configuration $\gamma'$ such that $\gamma_k \xrightarrow{\hat{op}} \gamma'$ and $\gamma' \models \varphi^k$. Recall that
the scheduler state of $\gamma_{k+1}$ is not constrained by $\varphi^k$ and primitive function calls can only
modify scheduler states. Thus, there is a configuration $\gamma_{k+1}$ that differs from $\gamma'$ only in the
scheduler state, such that $\gamma_k \xrightarrow{op} \gamma_{k+1}$ and $\gamma_{k+1} \models \varphi^k$. When $op$ has no primitive function
call, then we simply take $\gamma'$ as $\gamma_{k+1}$. By Lemma 4.6, it follows that $\gamma_{k+1}$ satisfies the target
node of $\xi_k$, and hence we have $\gamma_k \to \gamma_{k+1}$, as required.

Consider now the case where $\xi_k$ is an ARF connector. The connector $\xi_k$ is suppressed
in the computation of the strongest post-condition, that is, $\varphi^k$ is $\varphi^{k-1}$. We obtain $\gamma_{k+1}$
from $\gamma_k$ by replacing $\gamma_k$'s scheduler state with the scheduler state in the target node of $\xi_k$.
Since free variables of $\varphi^k$ do not range over variables tracked by the scheduler state and
$\gamma_k \models \varphi^{k-1}$, we have $\gamma_{k+1} \models \varphi^k$. By the construction of $\gamma_{k+1}$ and by Lemma 4.6, it follows
that $\gamma_{k+1}$ satisfies the target node of $\xi_k$, and hence we have $\gamma_k \to \gamma_{k+1}$, as required.

We now prove property (2). We prove that, for any run $\gamma_0, \gamma_1, \ldots$ of $P$ and for any
configuration $\gamma_i$ in the run, there is a node $\eta \in \mathit{Nodes}(\mathcal{F})$ such that $\gamma_i \models \eta$. We prove the
property by induction on the length $l$ of the run:

Case $l = 1$: This case is trivial because the initial configuration $\gamma_0$ satisfies the initial node,
and the construction of an ARF starts with the initial node.

Case $l > 1$: Let $\eta \in \mathit{Nodes}(\mathcal{F})$ be an ARF node such that the configuration $\gamma_l \models \eta$. If $\eta$
is covered by another node $\eta' \in \mathit{Nodes}(\mathcal{F})$, then, by Definition 4.3 of node coverage, we
have $\gamma_l \models \eta'$. Thus, we pick such an ARF node $\eta$ such that it is not covered by other
nodes.

Consider the transition $\gamma_l \xrightarrow{op} \gamma_{l+1}$ from $\gamma_l$ to $\gamma_{l+1}$. By the rule E1, the node $\eta$ has
a successor node $\eta'$ obtained by performing the operation $op$. By Lemma 4.6, we have
$\gamma_{l+1} \models \eta'$, as required.

Now, consider the transition $\gamma_l \xrightarrow{\cdot} \gamma_{l+1}$. Because the scheduler Sched implements
the function $\mathit{Sched}$ in the operational semantics, then, by the rule E2, the node $\eta$ has
a successor node $\eta'$ whose scheduler state coincides with that of $\gamma_{l+1}$. By Lemma 4.6, we have
$\gamma_{l+1} \models \eta'$, as required. □

Theorem (5.4). A transition system $M = (S, S_0, T)$ is safe w.r.t. a set $T_{err} \subseteq T$ of error
transitions iff $\mathit{Reach}_{red}(S_0, T)$ that satisfies the cycle condition does not contain any error
state from $E_{M,T_{err}}$.

Proof. If the transition system $M$ is safe w.r.t. $T_{err}$, then $\mathit{Reach}_{red}(S_0, T) \cap E_{M,T_{err}} = \emptyset$
follows obviously because $\mathit{Reach}(S_0, T) \cap E_{M,T_{err}} = \emptyset$ and $\mathit{Reach}_{red}(S_0, T) \subseteq \mathit{Reach}(S_0, T)$.

For the other direction, let us assume that the transition system $M$ is unsafe w.r.t. $T_{err}$.
Without loss of generality we also assume that $T_{err} = \{\alpha\}$. We prove that for every state
$s_0 \in S$ such that there is a path of length $n > 0$ leading to an error state $s_e$, there
is a path from $s_0$ to an error state $s'_e$ such that the path consists only of transitions in the
persistent sets of visited states. When the state $s_0$ is in $S_0$, then the states visited by the
latter path are only states in $\mathit{Reach}_{red}(S_0, T)$. We first show the proof for $n = 1$ and $n = 2$,
and then we generalize it for arbitrary $n > 1$.

Case $n = 1$: Let $s_0 \in S$ be such that $s_0 \xrightarrow{\alpha} s_e$ holds for an error state $s_e$. By the
successor-state condition, the persistent set in $s_0$ is non-empty. If the only persistent set in $s_0$ is
the singleton set $\{\alpha\}$, then the path $s_0 \xrightarrow{\alpha} s_e$ is the path leading to an error state and the
path consists only of transitions in the persistent sets of visited states. Suppose that the
transition $\alpha$ is not in the persistent set in $s_0$. Take the greatest $m > 0$ such that there is
a path

$$s_0 \xrightarrow{\gamma_0} s_1 \xrightarrow{\gamma_1} \cdots \xrightarrow{\gamma_{m-1}} s_m,$$

where for all $i = 0, \ldots, m-1$, the set $P_i$ is the persistent set in state $s_i$, the transition
$\gamma_i$ is in $P_i$, and the transition $\alpha$ is not in $P_i$ (see Figure 10). First, the above path exists
because of the successor-state condition and it must be finite because the set $S$ of states
is finite. The path cannot form a cycle, otherwise by the cycle condition the transition
$\alpha$ would have been in the persistent set in one of the states that form the cycle. That
is, by the above path, we delay the exploration of $\alpha$ as long as possible. Second, since
the transition $\alpha$ is enabled in $s_0$ and is independent in $s_i$ of any transition in $P_i$ for all
$i = 0, \ldots, m-1$ (otherwise $P_i$ is not a persistent set), then $\alpha$ remains enabled in $s_j$ for
$j = 1, \ldots, m$. Third, since $m$ is the greatest number, we have $\alpha$ in the persistent set in
the state $s_m$, and furthermore $s_m \xrightarrow{\alpha} s'_e$ holds for an error state $s'_e$. Thus, the path

$$s_0 \xrightarrow{\gamma_0} \cdots \xrightarrow{\gamma_{m-1}} s_m \xrightarrow{\alpha} s'_e$$

is the path from $s_0$ leading to an error state $s'_e$ involving only transitions in the persistent
sets of visited states.
Case $n = 2$: Let $s_0 \in S$ be such that there is a path

$$s_0 \xrightarrow{\beta_0} s'_1 \xrightarrow{\beta_1 = \alpha} s_e$$

for some state $s'_1$ and an error state $s_e$. By the successor-state condition, the persistent
set in $s_0$ is non-empty. If the only persistent set in $s_0$ is the singleton set $\{\beta_0\}$, then the
path $s_0 \xrightarrow{\beta_0} s'_1$ consists only of transitions in the persistent set. By the case $n = 1$, it is
guaranteed that there is a path from $s'_1$ leading to an error state $s'_e$ such that the path
consists only of transitions in the persistent sets of visited states. Thus, there is a path
from $s_0$ leading to an error state $s'_e$ such that the path consists only of transitions in the
persistent sets of visited states.

Suppose that the transition $\beta_0$ is not in the persistent set in $s_0$. Take the greatest
$m > 0$ such that there is a path

$$s_0 \xrightarrow{\gamma_0} s_1 \xrightarrow{\gamma_1} \cdots \xrightarrow{\gamma_{m-1}} s_m,$$

where for all $i = 0, \ldots, m-1$, the set $P_i$ is the persistent set in state $s_i$, the transition $\gamma_i$
is in $P_i$, and the transition $\beta_0$ is not in $P_i$ (see Figure 10). With the same reasoning as
in the case of $n = 1$, the above path exists, and is finite and acyclic. That is, we delay
the exploration of $\beta_0$ as long as possible.
Consider now the path

$$s_0 \xrightarrow{\gamma_0} s_1 \xrightarrow{\gamma_1} \cdots \xrightarrow{\gamma_{m-1}} s_m \xrightarrow{\beta_0} s'_{m+1}.$$

We show that an error state is reachable from the state $s'_{m+1}$. First, since the transitions
$\gamma_0$ and $\beta_0$ are independent in $s_0$, the transitions $\gamma_0$ and $\beta_0$ are enabled, respectively,
in the states $s'_1$ and $s_1$, and they commute in the state $s'_2$. The transition $\gamma_0$ is also
independent of the transition $\alpha$ in $s'_1$, otherwise $P_0$ is not a persistent set in $s_0$. Thus,
the transition $\alpha$ is enabled in $s'_2$. Second, since the transitions $\gamma_1$ and $\beta_0$ are independent
in $s_1$, the transitions $\gamma_1$ and $\beta_0$ are enabled, respectively, in the states $s'_2$ and $s_2$, and
they commute in the state $s'_3$. The transition $\gamma_1$ is independent of the transition $\alpha$ in $s'_2$,
otherwise $P_1$ is not a persistent set in $s_1$. Thus, the transition $\alpha$ is enabled in $s'_3$.

By repeatedly applying the above reasoning, it follows that the transition $\alpha$ is enabled
in the state $s'_{m+1}$. If the singleton set $\{\alpha\}$ is the only persistent set in $s'_{m+1}$, then we are
done. That is, the path

$$s_0 \xrightarrow{\gamma_0} s_1 \xrightarrow{\gamma_1} \cdots \xrightarrow{\gamma_{m-1}} s_m \xrightarrow{\beta_0} s'_{m+1} \xrightarrow{\alpha} s'_e$$

is the path from $s_0$ leading to an error state $s'_e$ such that it consists only of transitions
in the persistent sets of visited states.

In the same way as in the case of $n = 1$, if the transition $\alpha$ is not in the persistent set
in $s'_{m+1}$, then we can delay $\alpha$ as long as possible by taking the greatest $k > 0$ such that
there is a path

$$s'_{m+1} \xrightarrow{\gamma_m} s'_{m+2} \xrightarrow{\gamma_{m+1}} \cdots \xrightarrow{\gamma_{m+k-1}} s'_{m+k+1},$$

where for all $l = 1, \ldots, k+1$, the set $P_{m+l}$ is the persistent set in state $s'_{m+l}$, the transition
$\gamma_{m+l-1}$ is in $P_{m+l}$, and the transition $\alpha$ is not in $P_{m+l}$. Thus, the path

$$s_0 \xrightarrow{\gamma_0} s_1 \xrightarrow{\gamma_1} \cdots \xrightarrow{\gamma_{m-1}} s_m \xrightarrow{\beta_0} s'_{m+1} \cdots \xrightarrow{\gamma_{m+k-1}} s'_{m+k+1} \xrightarrow{\alpha} s'_e$$

is the path from $s_0$ leading to an error state $s'_e$ such that it consists only of transitions
in the persistent sets of visited states.
Case $n > 1$: Let $s_0 \in S$ be such that there is a path

$$s_0 \xrightarrow{\beta_0} s'_1 \xrightarrow{\beta_1} \cdots \xrightarrow{\beta_{n-1} = \alpha} s_e$$

for some state $s'_1$ and an error state $s_e$. By the successor-state condition, the persistent
set in $s_0$ is non-empty. If the only persistent set in $s_0$ is the singleton set $\{\beta_0\}$, then the
path $s_0 \xrightarrow{\beta_0} s'_1$ consists only of transitions in the persistent set. By the case $n - 1$, it is
guaranteed that there is a path from $s'_1$ leading to an error state $s'_e$ such that the path
consists only of transitions in the persistent sets of visited states. Thus, there is a path
from $s_0$ leading to an error state $s'_e$ such that the path consists only of transitions in the
persistent sets of visited states.

Figure 10: Cases of the proof of Theorem 5.4.

Suppose that the transition $\beta_0$ is not in the persistent set in $s_0$. Take the greatest
$m > 0$ such that there is a path

$$s_0 \xrightarrow{\gamma_0} s_1 \xrightarrow{\gamma_1} \cdots \xrightarrow{\gamma_{m-1}} s_m,$$

where for all $i = 0, \ldots, m-1$, the set $P_i$ is the persistent set in state $s_i$, the transition
$\gamma_i$ is in $P_i$, and the transition $\beta_0$ is not in $P_i$ (see Figure 10). That is, we delay the
exploration of $\beta_0$ as long as possible.
Consider now the path

$$s_0 \xrightarrow{\gamma_0} s_1 \xrightarrow{\gamma_1} \cdots \xrightarrow{\gamma_{m-1}} s_m \xrightarrow{\beta_0} s'_{m+1}.$$

With the same reasoning as in the case of $n = 2$, we have the transition $\beta_1$ enabled in
the state $s'_{m+1}$, and we can postpone the exploration of $\beta_1$ as long as possible. When $\beta_1$
gets explored, the transition $\beta_2$ is enabled in the successor state. By repeatedly applying
the same reasoning for transitions $\beta_k$ for $k = 2, \ldots, n-1$, the path formed in a similar
way to that of the case of $n = 2$ is the path from $s_0$ leading to an error state $s'_e$ such that
the path consists only of transitions in the persistent sets of visited states. □
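
The statement of the theorem can be observed on a toy system. The sketch below is an
illustration under stated assumptions, not the reduction implemented in ESST: two threads each
increment their own counter up to a hypothetical `LIMIT`, so their transitions are independent
by construction, and the persistent set in a state is taken to be the enabled transitions of
the lowest-numbered thread. The system is acyclic, so the cycle condition holds trivially. The
error state is chosen as `(LIMIT, LIMIT)` purely for illustration.

```python
from typing import Callable, List, Set, Tuple

State = Tuple[int, int]
Move = Tuple[int, State]  # (thread index, successor state)

LIMIT = 2  # hypothetical bound on each counter


def enabled(state: State) -> List[Move]:
    """Each thread may increment its own counter while it is below LIMIT."""
    moves: List[Move] = []
    for thread in range(2):
        if state[thread] < LIMIT:
            succ = list(state)
            succ[thread] += 1
            moves.append((thread, (succ[0], succ[1])))
    return moves


def all_moves(moves: List[Move]) -> List[Move]:
    return moves


def persistent(moves: List[Move]) -> List[Move]:
    """Keep only the moves of the lowest-numbered enabled thread (assumed persistent set)."""
    if not moves:
        return []
    first = min(thread for thread, _ in moves)
    return [move for move in moves if move[0] == first]


def reach(init: State, select: Callable[[List[Move]], List[Move]]) -> Set[State]:
    """Depth-first reachability restricted to the transitions chosen by `select`."""
    seen = {init}
    stack = [init]
    while stack:
        state = stack.pop()
        for _, succ in select(enabled(state)):
            if succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return seen


full = reach((0, 0), all_moves)
reduced = reach((0, 0), persistent)
error_state = (LIMIT, LIMIT)
```

Here the reduced search visits 5 of the 9 reachable states and still reaches the chosen error
state, in line with the theorem.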

Lemma (5.7). Let $\alpha$ and $\beta$ be transitions that are independent of each other such that
for concrete states $s_1, s_2, s_3$ and abstract state $\eta$ we have $s_1 \models \eta$, and both $\alpha(s_1, s_2)$ and
$\beta(s_2, s_3)$ hold. Let $\eta'$ be the abstract successor state of $\eta$ by applying the abstract strongest
post-operator to $\eta$ and $\beta$, and $\eta''$ be the abstract successor state of $\eta'$ by applying the abstract
strongest post-operator to $\eta'$ and $\alpha$. Then, there are concrete states $s_4$ and $s_5$ such that:
$\beta(s_1, s_4)$ holds, $s_4 \models \eta'$, $\alpha(s_4, s_5)$ holds, $s_5 \models \eta''$, and $s_3 = s_5$.

Proof. By the independence of $\alpha$ and $\beta$, we have that $\beta(s_1, s_4)$ holds. By the abstract strongest
post-operator, we have $s_4 \models \eta'$. By the independence of $\alpha$ and $\beta$, we have that $\alpha(s_4, s_5)$ holds.
By the abstract strongest post-operator and the fact that $s_4 \models \eta'$, we have $s_5 \models \eta''$. Finally,
by the independence of $\alpha$ and $\beta$, we have $s_3 = s_5$. □
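
The concrete commutation used in the last step can be checked on a small example. The
transitions below are hypothetical: `alpha` and `beta` touch disjoint variables and are
therefore independent, whereas `delta` reads the variable written by `beta` and is included
only as a contrasting dependent transition.

```python
from typing import Dict

State = Dict[str, int]


def alpha(s: State) -> State:
    return {**s, "x": s["x"] + 1}  # writes x only


def beta(s: State) -> State:
    return {**s, "y": s["y"] * 2}  # writes y only


def delta(s: State) -> State:
    return {**s, "x": s["y"]}  # reads y, so it depends on beta


s1 = {"x": 0, "y": 3}
s3 = beta(alpha(s1))  # alpha first, then beta
s5 = alpha(beta(s1))  # beta first, then alpha
dependent_orders_differ = beta(delta(s1)) != delta(beta(s1))
```

For the independent pair both orders end in the same state, which is the equality $s_3 = s_5$
of the lemma; for the dependent pair the two orders yield different states.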

Theorem (5.8). Let $P$ be a threaded sequential program. For all terminating
executions of ESST($P$) and ESST$_{POR}$($P$), we have that ESST($P$) reports safe iff so does
ESST$_{POR}$($P$).

Proof. We first prove the left-to-right direction of the iff and then prove the other direction.

($\Rightarrow$): Assume that ESST($P$) returns a safe ARF $\mathcal{F}$. Assume to the contrary that
ESST$_{POR}$ reports unsafe and returns a counter-example path $\rho$. By Theorem 4.7, we have
$\gamma \xrightarrow{*} \gamma'$ for an initial configuration $\gamma$ and an error configuration $\gamma'$ of $P$. That is, the error
configuration is in $\mathit{Reach}(P)$. Again, by Theorem 4.7, there is an ARF node $\eta \in \mathit{Nodes}(\mathcal{F})$
such that $\gamma' \models \eta$. But then the node $\eta$ is an error node, and $\mathcal{F}$ is not safe, which contradicts
our assumption that $\mathcal{F}$ is safe.

($\Leftarrow$): We lift Theorem 5.4 and its proof to the case of abstract transition system
or abstract state space with the help of Lemma 5.7. The lifting amounts to establishing
correspondences between the transition system $M = (S, S_0, T)$ in Theorem 5.4 and the ARF
constructed by ESST and ESST$_{POR}$. First, since the executions are terminating, the set of
reachable scheduler states is finite. Now let the set of ARF nodes reachable by the rules E1
and E2 correspond to the set $S$. That is, the set $S$ is now the set of ARF nodes. The set
$S_0$ contains only the initial node. A transition in $T$ represents either an ART path $\rho$ that
starts from the root of the ART and ends with a leaf of the ART, or an ARF connector.
The set $T_{err}$ of error transitions contains every transition in $T$ that represents
an ART path $\rho$ with an error node as the end node. The set $E_{M,T_{err}}$ consists of error nodes.
Every path $s_0 \xrightarrow{\alpha_0} s_1 \xrightarrow{\alpha_1} \cdots \xrightarrow{\alpha_{n-1}} s_n$ corresponds to the following path in the ARF:

(1) for $i = 0, \ldots, n$, the node $s_i$ is a node in the ARF,

(2) for $i = 0, \ldots, n-1$, there is an ARF path from $s_i$ to $s_{i+1}$ that is represented by the
transition $\alpha_i$, and

(3) for $i = 0, \ldots, n-1$, if the transition $\alpha_i$ leads to a node $s$ covered by another node $s'$,
then $s_{i+1}$ is $s'$.

We now exemplify how we address the issue of commutativity in the proof of Theorem 5.4.
Consider the case $n = 2$ where transitions $\gamma_0, \beta_0$ and $\beta_0, \gamma_0$ commute in $s'_2$. In the case
of abstract state space, they might not commute. However, by Lemma 5.7, they commute
in the concrete state space. Thus, the transition $\alpha$ is still enabled after performing the
transitions $\gamma_0, \beta_0$. □